Cladistics – An Exercise in Phylogenetic Relationships
Phylogeny is the evolutionary history of a group of interrelated species. Phylogenetic methods infer evolutionary relationships from molecular, morphological, and/or behavioral data, presenting these relationships in the form of a tree diagram.
A phylogenetic tree is composed of nodes and branches. Terminal nodes (the tips of branches) represent the taxa being compared, usually living ones; internal nodes (where two branches meet) represent last common ancestors and inferred ancestral morphotypes; and branches define the relationships between the nodes. Because only the branching order carries information, the branches at any bifurcating node can be rotated without changing the relationships between taxa.
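The point about rotating branches can be checked with a small sketch. It assumes a tree is written as nested Python tuples and that taxon names are plain strings; the helper name clades and the ape example are illustrative choices, not part of the exercise itself. Two trees describe the same relationships when they contain the same set of clades, regardless of the left-to-right order of the branches.

```python
def clades(tree):
    """Return the set of clades (each a frozenset of tip names) in a nested-tuple tree."""
    found = set()

    def tips(node):
        if isinstance(node, str):      # a terminal node (tip)
            return frozenset([node])
        # An internal node: its clade is the union of the tips below it.
        members = frozenset().union(*(tips(child) for child in node))
        found.add(members)
        return members

    tips(tree)
    return found


original = ((("human", "chimp"), "gorilla"), "orangutan")
rotated = ("orangutan", ("gorilla", ("chimp", "human")))      # branches rotated at every node
regrouped = ((("human", "gorilla"), "chimp"), "orangutan")    # a genuinely different tree

print(clades(original) == clades(rotated))     # True
print(clades(original) == clades(regrouped))   # False
```

Rotation changes only how the tree is drawn; regrouping changes which taxa share a node, and therefore the hypothesis of relationships.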
The principles of cladistics, originally called phylogenetic systematics, rest on the premise that one must discriminate between ‘true’ similarity (homology) and superficial, misleading similarity (analogy) in order to reconstruct phylogenetic relationships accurately. (A classic example of analogy is the similarity between dolphins and sharks: a shark is a fish and a dolphin is a mammal, but both have adapted to aquatic conditions. They are not closely related, but one must look beyond superficial similarity to realize this.)
To discuss the concepts of homology and analogy, we must first define two essential concepts used in cladistic analysis: character and character state. In fact, we have already used these concepts in the above example.
A character can be any recognizable feature, for example, a location on a DNA strand; a morphological trait, such as enamel thickness; or a behavioral trait, such as locomotor pattern.
A character state is the form each character takes. In the above three examples, character states could be: the presence of Adenine (A) at position three of a sequence of DNA; the presence of thin enamel; or being bipedal. A character must have two or more states.
To determine the polarity of character states (the direction in which they changed through time), we can root the cladogram using the outgroup method. An outgroup is a taxon that lies outside the group we are interested in, the ingroup, and is therefore equally distantly related to all of its members. Character states shared between the outgroup and members of the ingroup are inferred to be primitive; states found only within the ingroup are inferred to be derived.
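As a minimal sketch of outgroup comparison, the following assumes a small, entirely hypothetical character matrix (the taxon names and their states are invented for illustration) and a helper called polarize. Each ingroup state is labeled primitive if it matches the outgroup state and derived otherwise.

```python
# Hypothetical character matrix: each taxon maps characters to character states.
matrix = {
    "outgroup": {"enamel": "thin", "locomotion": "quadrupedal"},
    "taxon_A": {"enamel": "thick", "locomotion": "quadrupedal"},
    "taxon_B": {"enamel": "thick", "locomotion": "bipedal"},
    "taxon_C": {"enamel": "thin", "locomotion": "quadrupedal"},
}


def polarize(matrix, outgroup):
    """Label each ingroup state 'primitive' if it matches the outgroup, else 'derived'."""
    reference = matrix[outgroup]
    return {
        taxon: {
            char: ("primitive" if state == reference[char] else "derived")
            for char, state in states.items()
        }
        for taxon, states in matrix.items()
        if taxon != outgroup
    }


polarity = polarize(matrix, "outgroup")
for taxon, labels in polarity.items():
    print(taxon, labels)
```

In this made-up data set, thick enamel is derived in taxon_A and taxon_B, and bipedalism is derived only in taxon_B. Real analyses often use more than one outgroup, which this sketch does not attempt.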
Once the characters have been defined and the character states identified, a tree or cladogram can be constructed. The basic process involves grouping species with similar character states together. In contrast to phenetics (a seldom-used method based on evaluation of overall similarity), in cladistics we must partition similarity into three components: (1) that due to analogy (called homoplasy in cladistic analysis); (2) the sharing of primitive homologies; and (3) the sharing of derived homologies.
Homoplasy occurs when taxa (a taxon is a species or a group of related species; plural: taxa) evolve the same character state independently, through parallel or convergent evolution. Because such similarities do not reflect common ancestry, they are not used to define groups in cladistic analysis. Shared primitive homologies are similarities that are inherited from a distant common ancestor. These characters are also not useful in cladistic analysis. Character states found in only one taxon are uniquely derived and are also not used to group taxa in cladistics. After all, how could something unique to one organism tell you about another organism or their relationship?
Shared, derived homologies are the most important similarities in cladistic analysis. These are traits found in a group’s most recent common ancestor, but not in more distant ancestors, which still display the primitive character states. Only shared, derived traits can be used to define clades (i.e., groups that contain a common ancestor and all its descendants).
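The sorting of characters described above can be sketched in a few lines. The matrix below is hypothetical and is assumed to be already polarized by outgroup comparison (0 for the primitive state, 1 for the derived state); the function name classify and the three labels are illustrative.

```python
# Hypothetical ingroup matrix, already polarized by outgroup comparison:
# 0 = primitive state (same as the outgroup), 1 = derived state.
characters = ["char_1", "char_2", "char_3", "char_4"]
ingroup = {
    "taxon_A": [1, 0, 1, 0],
    "taxon_B": [1, 0, 0, 0],
    "taxon_C": [0, 0, 0, 1],
    "taxon_D": [0, 0, 0, 0],
}


def classify(ingroup, characters):
    """Sort characters by how many ingroup taxa show the derived state."""
    labels = {}
    for i, name in enumerate(characters):
        derived = sorted(t for t, row in ingroup.items() if row[i] == 1)
        if not derived:
            labels[name] = ("shared primitive", derived)
        elif len(derived) == 1:
            labels[name] = ("uniquely derived", derived)
        else:
            labels[name] = ("candidate shared derived", derived)
    return labels


for name, (label, taxa) in classify(ingroup, characters).items():
    print(name, label, taxa)
```

Only char_1 groups taxa here (taxon_A with taxon_B). The word "candidate" is deliberate: a matrix alone cannot distinguish a shared derived homology from homoplasy, which only shows up as conflict once characters are placed on a tree.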
To implement these cladistic concepts, we use another principle, that of maximum parsimony. The principle of maximum parsimony holds that the hypothesis requiring the fewest assumptions is the preferred explanation of the data at hand. In evolutionary terms, this means that the phylogenetic hypothesis (a tree) requiring the fewest character state changes is the best, or most parsimonious. By finding the evolutionary tree with the fewest changes on it, we automatically minimize the number of homoplasies, that is, characters that require additional explanations as to how they evolved. This has the fortunate effect of maximizing the number of shared derived traits, which define the groups we are interested in discovering. Since shared derived traits are the only type of trait cladistic analyses should use, the tree with the greatest number of shared derived traits (and thus the fewest homoplasies) is the phylogenetic tree that best represents the data.
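To make the counting of changes concrete, here is a minimal sketch that scores three candidate trees for a hypothetical data set. It uses Fitch's small-parsimony algorithm, one common way to count the minimum number of changes on a given tree; the text does not prescribe a particular algorithm, so this is an assumption. The taxa (O for the outgroup, A, B, C for the ingroup), the 0/1 matrix, and the function names are all invented for illustration.

```python
def fitch(tree, states):
    """Return (possible ancestral states, minimum changes) for one character."""
    if isinstance(tree, str):                  # a tip: its observed state, no changes
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    common = left_set & right_set
    if common:                                 # branches agree: no extra change needed
        return common, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1   # disagreement: one change


def tree_length(tree, matrix):
    """Total number of changes the tree requires across all characters."""
    n_chars = len(next(iter(matrix.values())))
    total = 0
    for i in range(n_chars):
        states = {taxon: row[i] for taxon, row in matrix.items()}
        total += fitch(tree, states)[1]
    return total


# Hypothetical matrix: 0 = primitive, 1 = derived; O is the outgroup.
matrix = {
    "O": [0, 0, 0, 0],
    "A": [1, 1, 0, 1],
    "B": [1, 1, 1, 1],
    "C": [0, 0, 1, 1],
}

candidates = {
    "((A,B),C)": ("O", (("A", "B"), "C")),
    "((B,C),A)": ("O", (("B", "C"), "A")),
    "((A,C),B)": ("O", (("A", "C"), "B")),
}

lengths = {name: tree_length(tree, matrix) for name, tree in candidates.items()}
best = min(lengths, key=lengths.get)
print(lengths)   # {'((A,B),C)': 5, '((B,C),A)': 6, '((A,C),B)': 7}
print(best)      # ((A,B),C)
```

The tree grouping A with B needs five changes, one fewer than the next best, because two characters support that grouping and only one conflicts with it. The conflicting third character must change twice on that tree, which is how homoplasy appears in a parsimony analysis.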