# Morphometric Analysis

## Morphometrics and physical markers

Morphometrics (*morpho*: shape; *metrics*: measurements) is the use of physical measurements to determine the relatedness of organisms. For organisms that went extinct long ago, DNA extraction is difficult, so physical measurements are used instead. Likewise, before DNA technologies were available for analyzing species, Linnaean taxonomy classified organisms based on similarities in their features.

### Describing Species and Variation of Morphologies

Below are images of skull landmarks of the lizard family Varanidae, which includes monitor lizards and Komodo dragons. The general morphology of the skulls is similar enough that they all retain the same landmarks. The figure also illustrates the wide variation in skull shape between species.

### Landmarks standardize measurements

Having a set of shared landmarks provides the opportunity to make systematic measurements of morphometric features.

### Euclidean distance to measure relatedness

Euclidean distance is a measurement derived from the Pythagorean theorem that describes the shortest distance ($d$) between two points ($A$ and $B$) as the length of the straight line connecting them. In a Cartesian space, the points can be defined as:

$A=(x_A, y_A)$ and $B=(x_B, y_B)$

The Pythagorean theorem for a right triangle with legs $x$ and $y$ and hypotenuse $d$ can be expressed as:

$x^2 + y^2 = d^2$

To find the distance between the two points, we solve for $d$:

$d = \sqrt{x^2 + y^2}$

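In this case, the legs of the triangle are the differences between the coordinates of the two points:

$\Delta x = x_B - x_A$ and $\Delta y = y_B - y_A$

Substituting these differences for $x$ and $y$ gives the distance between $A$ and $B$:

$d = \sqrt{(x_B - x_A)^2 + (y_B - y_A)^2}$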
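As a quick check of the two-point formula, the short R sketch below computes the distance between two points. The coordinates are made-up values chosen for illustration, not landmark data.

```r
## two hypothetical points (illustrative values, not from the dataset)
A = c(x = 1, y = 2)
B = c(x = 4, y = 6)
## differences between the coordinates
delta_x = B[['x']] - A[['x']]
delta_y = B[['y']] - A[['y']]
## Pythagorean theorem solved for d
d = sqrt(delta_x^2 + delta_y^2)
d
```

Here the coordinate differences are 3 and 4, so the distance is 5.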

We can then extend this idea to specimens described by $p$ measurements each. For two specimens $i$ and $j$, the squared differences are summed over every measurement $k$:

$d(\mathbf{X}_i, \mathbf{X}_j) = \sqrt{\sum_{k=1}^{p}(X_{ik} - X_{jk})^2}$
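The same calculation can be sketched in R for more than two measurements. The small table below is made up for illustration (the specimen names and values are not from McCurry et al.), and `euclid` is a hypothetical helper name. The sketch compares the formula written out by hand with R's built-in `dist()` function.

```r
## hypothetical measurements: 3 specimens (rows) x 3 measurements (columns)
X = matrix(c(1, 2, 3,
             4, 6, 3,
             1, 2, 5),
           nrow = 3, byrow = TRUE,
           dimnames = list(c('sp1', 'sp2', 'sp3'), c('m1', 'm2', 'm3')))
## Euclidean distance between rows i and j, written out from the formula
euclid = function(X, i, j) sqrt(sum((X[i, ] - X[j, ])^2))
d_manual = euclid(X, 'sp1', 'sp2')
## the same value taken from the built-in dist() function
d_builtin = as.matrix(dist(X, method = 'euclidean'))['sp1', 'sp2']
c(manual = d_manual, builtin = d_builtin)
```

Both approaches should agree, which is a useful sanity check before applying `dist()` to a full table of measurements.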

#### Calculating distance with R

1. Download the dataset (McCurry et al. 2015) associated with this activity, a comma-separated values (.csv) file. It can be opened in a spreadsheet or a text editor, and it can be imported into R to calculate Euclidean distances from the landmark measurements.
2. The following code in R reads the dataset into a variable called “varanoid”, calculates Euclidean distances, and saves a plot as a PDF file in the “/tmp” directory.

       ## install curl for fetching from the internet if it isn't installed
       if (!requireNamespace('curl', quietly = TRUE)) install.packages('curl')
       ## load the curl library
       library(curl)
       ## read the data of measurements and assign it to a variable 'varanoid'
       ## NOTE: 'DATASET_URL' is a placeholder; replace it with the address of the .csv file
       varanoid = read.csv(curl('DATASET_URL'))
       ## set the row names to the Species column
       row.names(varanoid) = varanoid$Species
       ## remove the first column of the table to have purely numeric data
       varanoid_truncated = varanoid[, 2:14]
       ## calculate distance using euclidean as the method
       dist_measure = dist(varanoid_truncated, method = 'euclidean')
       ## display dist_measure to look at the comparisons
       dist_measure
       ## cluster the species hierarchically based on the distances
       varanoid_cluster = hclust(dist_measure)
       ## open PDF as a graphics device to save a file in the '/tmp' directory
       pdf(file = '/tmp/varanoid_tree.pdf')
       ## draw the cluster dendrogram
       plot(varanoid_cluster)
       ## close the device to save the plot as pdf
       dev.off()
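Because the script above depends on the downloaded dataset, the same pipeline can be tried on a small made-up table first. In this sketch the species labels and measurements are hypothetical; `dist()`, `hclust()` and `cutree()` are base R functions, and `cutree()` is an extra step added here only to show which specimens group together.

```r
## hypothetical measurements for four specimens (not real data)
toy = data.frame(Species = c('A', 'B', 'C', 'D'),
                 length  = c(10, 11, 30, 31),
                 width   = c(5, 6, 15, 14))
## use the Species column as row names, then keep only numeric columns
row.names(toy) = toy$Species
toy_numeric = toy[, c('length', 'width')]
## pairwise Euclidean distances and hierarchical clustering
toy_dist = dist(toy_numeric, method = 'euclidean')
toy_cluster = hclust(toy_dist)
## cut the tree into two groups to see which specimens cluster together
groups = cutree(toy_cluster, k = 2)
groups
```

With these values, A and B fall into one group and C and D into the other, because the specimens within each pair have nearly identical measurements.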


## DNA Analysis

Before starting this activity, review bioinformatics and sequence analysis.

1. Search NCBI for mitochondrial sequences from the species included in McCurry et al. (2015). The sequences were submitted by Ast (2001).
2. Find the sequences and identify and extract the elements that are common to all species.
3. Assemble the shared sequences in a text editor as a single FASTA file in which each species' sequence is preceded by a header line (“>Species A”).
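The FASTA file in step 3 can also be assembled from R instead of a text editor. The sketch below uses only base R; the species names, the sequences, and the output file name are placeholders invented for illustration and must be replaced with the sequences retrieved from NCBI.

```r
## placeholder sequences (NOT real data); replace with sequences from NCBI
sequences = c('Species A' = 'ATGCATGCATGC',
              'Species B' = 'ATGCATGAATGC')
## build alternating header and sequence lines, one pair per species
fasta_lines = as.vector(rbind(paste0('>', names(sequences)), sequences))
## write the lines to a FASTA file in a temporary location
fasta_file = file.path(tempdir(), 'shared_sequences.fasta')
writeLines(fasta_lines, fasta_file)
## read the file back to check its layout
readLines(fasta_file)
```

Each header line starts with `>` and is followed by that species' sequence on the next line, which is the layout described in step 3.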