Fig. 1: A speculatively rooted tree for rRNA genes

A phylogenetic tree, also called an evolutionary tree, is a tree showing the evolutionary interrelationships among various species or other entities that are believed to have a common ancestor. In a phylogenetic tree, each node with descendants represents the most recent common ancestor of the descendants, with edge lengths sometimes corresponding to time estimates. Each node in a phylogenetic tree is called a taxonomic unit. Internal nodes are generally referred to as Hypothetical Taxonomic Units (HTUs) as they cannot be directly observed.

## Types of phylogenetic trees

Fig. 1: Unrooted tree of the myosin supergene family[1]
Fig. 2: A highly resolved, automatically generated Tree Of Life, based on completely sequenced genomes [2][3].
A phylogenetic tree, showing how Eukaryota and Archaea are more closely related to each other than to Bacteria, based on Cavalier-Smith's theory of bacterial evolution.

Unrooted trees illustrate the relatedness of the leaf nodes without making assumptions about common ancestry. While unrooted trees can always be generated from rooted ones by simply omitting the root, a root cannot be inferred from an unrooted tree without some means of identifying ancestry; this is normally done by including an outgroup in the input data or introducing additional assumptions about the relative rates of evolution on each branch, such as an application of the molecular clock hypothesis. Figure 1 depicts an unrooted phylogenetic tree for myosin, a superfamily of proteins.[4]
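Rooting with an outgroup can be illustrated with a minimal sketch. It assumes the unrooted tree is stored as an adjacency dictionary and that the outgroup is a single leaf; the function name `root_with_outgroup`, the node names, and the nested-tuple output are illustrative choices, not part of any standard library.

```python
def root_with_outgroup(adj, outgroup):
    """Root an unrooted tree on the branch leading to `outgroup`.

    adj maps every node to the list of its neighbours (undirected edges).
    The result is a nested tuple whose leaves are leaf labels; the names
    of internal nodes are dropped.
    """
    neighbours = adj[outgroup]
    if len(neighbours) != 1:
        raise ValueError("outgroup must be a leaf")

    def subtree(node, parent):
        # Everything reachable from `node` without going back to `parent`.
        children = [c for c in adj[node] if c != parent]
        if not children:  # a leaf
            return node
        return tuple(subtree(c, node) for c in children)

    # The new root sits on the edge between the outgroup and its neighbour.
    return (outgroup, subtree(neighbours[0], outgroup))


# Hypothetical four-leaf unrooted tree: A and B join at x, C and D join at y.
adj = {
    "A": ["x"], "B": ["x"], "C": ["y"], "D": ["y"],
    "x": ["A", "B", "y"], "y": ["C", "D", "x"],
}
print(root_with_outgroup(adj, "D"))  # ('D', ('C', ('A', 'B')))
```

Choosing a different outgroup leaf yields a different rooted tree from the same unrooted topology, which is why the root cannot be recovered without such outside information.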

Both rooted and unrooted phylogenetic trees can be either bifurcating or multifurcating, and either labeled or unlabeled. A bifurcating tree has a maximum of two descendants arising from each interior node, while a multifurcating tree may have more than two. A labeled tree has specific values assigned to its leaves, while an unlabeled tree, sometimes called a tree shape, only defines a topology. The number of possible trees for a given number of leaf nodes depends on the specific type of tree, but there are always more multifurcating than bifurcating trees, more labeled than unlabeled trees, and more rooted than unrooted trees. The last distinction is the most biologically relevant; it arises because there are many places on an unrooted tree to put the root. For labeled bifurcating trees, there are

$\frac{(2n-3)!}{2^{n-2}(n-2)!}$

total rooted trees and

$\frac{(2n-5)!}{2^{n-3}(n-3)!}$

total unrooted trees, where n represents the number of leaf nodes. The number of unrooted trees for n input sequences or species is equal to the number of rooted trees for n-1 sequences.[5]
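The two counting formulas are easy to evaluate directly. The sketch below is only an illustration: the function names are made up here, and the lower bounds on n (at least 2 leaves for a rooted tree, at least 3 for an unrooted one) are assumptions added so that the factorials are defined.

```python
from math import factorial


def count_rooted(n: int) -> int:
    """Labeled bifurcating rooted trees on n leaves: (2n-3)! / (2^(n-2) (n-2)!)."""
    if n < 2:
        raise ValueError("need at least 2 leaves")
    return factorial(2 * n - 3) // (2 ** (n - 2) * factorial(n - 2))


def count_unrooted(n: int) -> int:
    """Labeled bifurcating unrooted trees on n leaves: (2n-5)! / (2^(n-3) (n-3)!)."""
    if n < 3:
        raise ValueError("need at least 3 leaves")
    return factorial(2 * n - 5) // (2 ** (n - 3) * factorial(n - 3))


for n in (3, 4, 5, 10):
    print(n, count_rooted(n), count_unrooted(n))
# 3 3 1
# 4 15 3
# 5 105 15
# 10 34459425 2027025
```

The output also shows the stated identity: the unrooted count for n leaves equals the rooted count for n-1 leaves. The rapid growth of both counts is what makes exhaustive tree search impractical for more than a handful of sequences.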

A dendrogram is a broad term for the diagrammatic representation of a phylogenetic tree.

A cladogram is a tree formed using cladistic methods. This type of tree only represents a branching pattern, i.e., its branch lengths do not represent time.

A phylogram is a phylogenetic tree that explicitly represents the number of character changes through its branch lengths.

An ultrametric tree or chronogram is a phylogenetic tree that explicitly represents evolutionary time through its branch lengths.
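One practical consequence, assuming every leaf is sampled at the same time (for example, all taxa are extant), is that every leaf of an ultrametric tree lies at the same distance from the root. The sketch below checks that property. The tree encoding (a leaf is a string, an internal node is a list of `(subtree, branch_length)` pairs) and the function names are assumptions made for this example.

```python
import math


def tip_depths(tree, depth=0.0):
    """Yield the root-to-tip distance of every leaf in the tree."""
    if isinstance(tree, str):  # a leaf label
        yield depth
    else:
        for child, length in tree:
            yield from tip_depths(child, depth + length)


def is_ultrametric(tree, tol=1e-9):
    """True if all leaves are equally distant from the root (within tol)."""
    depths = list(tip_depths(tree))
    return all(math.isclose(d, depths[0], abs_tol=tol) for d in depths)


# Hypothetical three-leaf trees with the same topology.
chronogram = [([("A", 1.0), ("B", 1.0)], 2.0), ("C", 3.0)]
phylogram = [([("A", 0.5), ("B", 1.5)], 2.0), ("C", 3.0)]
print(is_ultrametric(chronogram))  # True
print(is_ultrametric(phylogram))   # False
```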

## Phylogenetic tree construction

Phylogenetic trees among a nontrivial number of input sequences are constructed using computational phylogenetics methods. Distance-matrix methods such as neighbor-joining or UPGMA, which calculate genetic distance from multiple sequence alignments, are simplest to implement, but do not invoke an evolutionary model. Many sequence alignment methods such as ClustalW produce both sequence alignments and phylogenetic trees. Methods including maximum parsimony, maximum likelihood and Bayesian inference apply an explicit model of evolution to phylogenetics.[5] Identifying the optimal tree using many of these techniques is NP-hard,[5] so heuristic search and optimization methods are used in combination with tree-scoring functions to identify a reasonably good tree that fits the data.
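To show why distance-matrix methods are considered simple to implement, here is a minimal UPGMA sketch. It assumes the pairwise distances have already been computed (the step from an alignment to distances is not shown) and are supplied as a nested dictionary covering the upper triangle; the function name, input layout, and the `(left, right, height)` output encoding are choices made for this example rather than any library's API.

```python
from itertools import combinations


def upgma(labels, dist):
    """Cluster `labels` by UPGMA.

    dist[a][b] must hold the distance for every pair with a before b in
    `labels`. Returns a leaf label, or a tuple (left, right, height) where
    height is the distance from the merged node down to its leaves.
    """
    # Active clusters: id -> (tree, number of leaves).
    clusters = {i: (name, 1) for i, name in enumerate(labels)}
    d = {frozenset((i, j)): dist[labels[i]][labels[j]]
         for i, j in combinations(range(len(labels)), 2)}
    next_id = len(labels)

    while len(clusters) > 1:
        pair = min(d, key=d.get)  # closest pair of active clusters
        a, b = sorted(pair)
        (tree_a, size_a), (tree_b, size_b) = clusters.pop(a), clusters.pop(b)
        height = d.pop(pair) / 2  # new node sits halfway between the two
        # Distance to each remaining cluster is the size-weighted average.
        for k in clusters:
            d_ak = d.pop(frozenset((a, k)))
            d_bk = d.pop(frozenset((b, k)))
            d[frozenset((next_id, k))] = (
                (size_a * d_ak + size_b * d_bk) / (size_a + size_b)
            )
        clusters[next_id] = ((tree_a, tree_b, height), size_a + size_b)
        next_id += 1

    (tree, _), = clusters.values()
    return tree


# Hypothetical distances between four sequences.
labels = ["A", "B", "C", "D"]
dist = {"A": {"B": 2, "C": 6, "D": 10},
        "B": {"C": 6, "D": 10},
        "C": {"D": 10}}
print(upgma(labels, dist))
# ('D', ('C', ('A', 'B', 1.0), 3.0), 5.0)
```

UPGMA always produces a rooted tree in which all leaves are equidistant from the root, so it implicitly assumes a constant rate of change along every branch; neighbor-joining does not make that assumption.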

Tree-building methods can be assessed on the basis of several criteria:[6]

• efficiency (how long does it take to compute the answer, how much memory does it need?)
• power (does it make good use of the data, or is information being wasted?)
• consistency (will it converge on the same answer repeatedly, if each time given different data for the same model problem?)
• robustness (does it cope well with violations of the assumptions of the underlying model?)
• falsifiability (does it alert us when it is not good to use, i.e. when assumptions are violated?)

Tree-building techniques have also gained the attention of mathematicians. Trees can also be built using T-theory.[7]

## Limitations of phylogenetic trees

Basing the analysis on a single gene or protein taken from a group of species can be problematic, because trees constructed from another, unrelated gene or protein sequence often differ from the first; great care is therefore needed in inferring phylogenetic relationships amongst species. This is especially true of genetic material that is subject to lateral gene transfer and recombination, where different haplotype blocks can have different histories.

When extinct species are included in a tree, they should always be terminal nodes, as it is unlikely that they are direct ancestors of any extant species. Scepticism is warranted when extinct species are included in trees that are wholly or partly based on DNA sequence data, due to evidence that "ancient DNA" is not preserved intact for longer than 100,000 years.[citation needed]

## References

