Cladistics for Palaeontologists - Introduction


1. Introduction

Written by Peter Forey - The Natural History Museum, London, UK (email: plf@nhm.ac.uk). This article first appeared in the Nº 60 edition of Palaeontology Newsletter.

Background

Cladistics was introduced by the German entomologist Willi Hennig, who put forward his ideas in 1950. He wrote in his native language, so these were completely ignored until 1966 when an English translation of a manuscript was published under the title “Phylogenetic Systematics” (Hennig 1966). It is not an easy book to read but fortunately many others have been written that have both fleshed out and distorted his ideas. Hennig’s most important contribution was to offer a precise definition of biological relationship and to suggest how that relationship might be discovered.

Taxon and character relationship

Hennig’s concept of relationship is illustrated in Figure 1. Considering three taxa, the salmon and the lizard are more closely related to each other than either is to the shark. This is so because the salmon and the lizard share a common ancestor, ‘x’, which lived at time t2 and which is not shared with the shark or any other taxon. Similarly, the shark is more closely related to the group ‘salmon+lizard’ than it is to the lamprey, because the shark, salmon and lizard together share a unique common ancestor – ‘y’, which lived at an earlier time t1. The salmon and lizard are called sister-groups; the shark is the sister-group of the combined group salmon+lizard. By extension, the lamprey is the sister-group of shark+salmon+lizard. The aim of cladistic analysis is to discover this sister-group hierarchy, and express the results in branching diagrams. These diagrams are called cladograms, a reference to the fact that they purport to express the genealogical units or clades (the word ‘Cladistics’ was, ironically, coined by Ernst Mayr – a life-long opponent of cladistic classification). The search for the sister-group, and with it the concept of two taxa being more closely related to each other than either is to a third (the three-taxon statement), is fundamental to cladistics.

Figure 1. Hennig’s concept of relationships among taxa A – D. See text for discussion.
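
The sister-group hierarchy described above can be sketched in a few lines of Python. This is only an illustration: the nested-tuple encoding and the function names `taxa` and `sister_of` are assumptions made here, not part of any cladistics program.

```python
# Nested tuples stand in for the branching diagram of Figure 1.
# Each pair (left, right) is a node; strings are terminal taxa.
TREE = ("lamprey", ("shark", ("salmon", "lizard")))


def taxa(node):
    """Return the set of terminal taxa below a node."""
    if isinstance(node, str):
        return {node}
    left, right = node
    return taxa(left) | taxa(right)


def sister_of(tree, group):
    """Return the sister-group of `group` (a set of taxa), or None if absent."""
    if isinstance(tree, str):
        return None
    left, right = tree
    if taxa(left) == group:
        return taxa(right)
    if taxa(right) == group:
        return taxa(left)
    # Otherwise look further down each branch.
    return sister_of(left, group) or sister_of(right, group)


print(sister_of(TREE, {"salmon"}))                      # {'lizard'}
print(sister_of(TREE, {"salmon", "lizard"}))            # {'shark'}
print(sister_of(TREE, {"shark", "salmon", "lizard"}))   # {'lamprey'}
```

Each call returns the other branch at the node where the queried group joins the rest of the diagram, which is all that "sister-group" means here.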

Sister-groups are discovered by identifying characters (or character states) that are uniquely shared by two of the three groups under consideration. But not just any characters (or character states – we will deal with the relation between character and character state in the next article). Hennig made a distinction between two types of characters (or character states) and this distinction depended on where they occurred in the phylogenetic history of a particular group. The character or the state of the character which occurs in the ancestral morphotype he called “plesiomorphic” (near to the ancestral morphology), and the derived character, or the derived state, he called “apomorphic” (away from the ancestral morphology). Here, it is only necessary to emphasise that the terms apomorphic and plesiomorphic are relative terms – relative to a particular systematic problem. In Figure 2A, character state “a” is plesiomorphic and “a prime” is apomorphic. State “a prime” is presumed to have been present in the ancestral morphotype which gave rise to taxa B and C. The presence of character “a prime” – the apomorphic state – in taxa B and C is evidence of their immediate common ancestry and their sister-group relationship. “a prime” is a shared apomorphy or a synapomorphy suggesting that taxa B and C are more closely related to each other than either is to A. In Figure 2B, “a prime” is apomorphic with respect to “a” but it is plesiomorphic with respect to “a double prime”. So, just as the relationship of taxa is relative, so is the relationship of characters (or character states).

Figure 2. Hennig’s ideas of relationships between character states. See text for discussion.

Hennig thought that you could decide which was the apomorphic state and which was plesiomorphic before you did the analysis. He had several criteria for this of which stratigraphic order was the most relevant to us – the state of a character that occurs earlier in the fossil record is to be regarded as the plesiomorphic state. This did not go down very well with neontologists, nor with many palaeontologists, because it relied on the faithfulness of the fossil record to document the truth. Today, there are two criteria that are used: the outgroup and ontogenetic sequence, both of which we will explore in the next article.

Hennig introduced a third state that he called autapomorphic. This is the state that occurs in only one of the taxa under consideration. And once again, autapomorphic characters in one analysis may be synapomorphies in another.

A real example is given in Figure 3. In Figure 3 characters numbered 3 and 4 are synapomorphies suggesting that the lizard and the salmon shared a unique common ancestor ‘Z’. It suggests that characters 3 and 4 arose in ancestor ‘Z’ and were inherited by the salmon and the lizard. Shared primitive characters (symplesiomorphies) are characters inherited from a more remote ancestry and are irrelevant to the problem of relationship of the lizard and the salmon. For example, the shared possession of characters 1 and 2 in the salmon and lizard would not imply that they shared a unique common ancestor because these attributes are also found in the shark. Characters 1 and 2 may be useful at a more inclusive hierarchical level to suggest common ancestry at ‘Y’. With respect to the three-taxon problem (shark, salmon and lizard) then characters 1 and 2 are symplesiomorphies and they suggest nothing other than that the shark, salmon and lizard are a group. Similarly, characters 5 – 9 and 10 – 12 are autapomorphies and irrelevant to discovering relationships since they are each found in only one of the taxa. Sister-groups are discovered by identifying shared derived apomorphic characters (synapomorphies) inferred to have originated in the latest common ancestor and shared by descendants. These synapomorphies can be thought of as evolutionary homologies: that is, as structures inherited from the immediate common ancestor.

Figure 3. An example of a phylogeny showing characters by which taxa are recognised. Characters 1 – 4 are synapomorphies, 5 – 12 are autapomorphies and 13 is an attribute seen in the salmon and the shark. See text for discussion.

Another way we can think of this is to ask the question “what groups are specified by what characters?” In Figure 3, given four taxa of (initially) unknown interrelationships, characters 1 and 2 suggest a group Shark + Salmon + Lizard. Characters 3 and 4 suggest a group Salmon + Lizard. Taken together, characters 1, 2, 3 and 4 suggest two nested groups, one more inclusive than the other: (Shark (Salmon, Lizard)).
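
The question “what groups are specified by what characters?” can be asked mechanically. The sketch below is a minimal illustration, assuming the character distributions described in the text for Figure 3; characters 10 – 12 are left out because the text does not say which taxon bears them, and character 5 is assumed to belong to the lizard. The function names are hypothetical.

```python
# Which taxa show each character, following the text's account of Figure 3.
CHARACTERS = {
    1: {"shark", "salmon", "lizard"},
    2: {"shark", "salmon", "lizard"},
    3: {"salmon", "lizard"},
    4: {"salmon", "lizard"},
    5: {"lizard"},  # one of the autapomorphies 5-9 (assumed to be the lizard's)
}


def groups_specified(characters):
    """Map each group of two or more taxa to the characters that specify it."""
    groups = {}
    for number, holders in characters.items():
        if len(holders) < 2:
            continue  # an autapomorphy groups nothing
        groups.setdefault(frozenset(holders), []).append(number)
    return groups


def nested(groups):
    """True if every pair of groups is either nested or disjoint."""
    sets = list(groups)
    return all(a <= b or b <= a or not (a & b) for a in sets for b in sets)


groups = groups_specified(CHARACTERS)
for members, numbers in sorted(groups.items(), key=lambda kv: -len(kv[0])):
    print(sorted(members), "specified by characters", numbers)
print("nested hierarchy:", nested(groups))

# Character 13 (seen in the shark and the salmon) specifies a group that
# overlaps salmon+lizard without being nested in it.
with_13 = dict(CHARACTERS)
with_13[13] = {"shark", "salmon"}
print("nested with character 13:", nested(groups_specified(with_13)))
```

As long as the groups nest, they can be written as a single parenthetical hierarchy. Once a character specifies an overlapping group, a choice has to be made between competing hierarchies.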

Parsimony and steps on a cladogram

All is not well in Figure 3. The characters do not always specify the same groups. For instance, character 13 (fin rays present in the shark and the salmon) suggests that the salmon and the shark are sister-groups relative to the lizard. So, with the characters at hand there are two theories of taxon relationship. These are shown in Figure 4. In alternative 1 the shark and salmon are sister-groups evidenced by the common possession of character 13. However, if we accept this we have to assume that characters 3 and 4 were either gained twice (once in the salmon and once in the lizard) or that they were gained in the common ancestor of shark+salmon+lizard and subsequently lost in the shark.

Alternative 2 is that the lizard and salmon are sister-groups, evidenced by the common possession of characters 3 and 4, and we have to assume that character 13 was either gained independently in the salmon and the shark or gained in the common ancestor of shark+salmon+lizard and subsequently lost in the lizard. Alternative 1 therefore needs an extra assumption for each of two characters (3 and 4), whereas alternative 2 needs one for a single character (13). In other words alternative 1 is more costly in terms of the number of assumptions that we have to make about character evolution. In cladistic analysis, if given no more information, we choose alternative 2 because it assumes the least (or to turn it on its head – it explains the most in the minimum way). Alternative 2 is the more parsimonious solution and therefore is to be preferred. OK – I can hear the cries “but fin rays are more important than large dermal bones, maxilla and dentary.” Maybe, but that is another argument and one that is usually the source of multitudes of disputes. Cladists use parsimony to choose between alternatives because parsimony is a universal rule – it can be applied everywhere in the same way. It does not mean that evolution has followed the most parsimonious course. You do not have to accept the most parsimonious solution, you just have to explain why you do not!

Figure 4. Parsimony. The theory to the right explains the most and assumes the least, and is to be preferred. See text for discussion.

We can think of this in a slightly different way that is revealed in the computer programs used by cladists. In Figure 5 there are four taxa displaying states for six characters and this is displayed in the taxon by character data matrix at the top (data matrices are the daily currency of cladistics). Just for now let us assume that empty cells mean absence of something and that absence is plesiomorphic. Taxon A has none of the attributes. It is wholly plesiomorphic with respect to B, C and D. Taxa B, C and D have various complements of the other characters. Given this information there are three ways in which Taxa B, C and D can be interrelated, and these are shown in the top line of cladograms. The individual characters can be placed on each of the cladograms according to the groups that they specify. For instance character 1 specifies a group B+C+D and therefore will be placed on all cladograms just once. Characters 2 and 4 are autapomorphies and therefore they too will fit to all cladograms just once (note that these two characters do not help resolve any relationships and some people will ignore them). Characters 3 and 5 specify a group C+D and therefore will be placed on the cladogram to the left once. On this cladogram characters 3 and 5 are said to be congruent; they fit the tree perfectly. On the other two trees characters 3 and 5 are said to be homoplasious because they do not fit the tree perfectly; two occurrences are needed to explain their distribution. When all characters are fitted on to the cladogram on the left then all but character 6 appear once; character 6 appears twice. If we simply count up the number of times characters appear this equals seven. This cladogram is said to be seven steps long because it requires seven transformations of the characters to explain their distribution in the most parsimonious way (computer programs report the length of the cladogram and authors always give this). If all characters are fitted to all three cladograms then we will see the centre cladogram and the one to the right are longer (nine and eight steps respectively). In other words the cladogram to the left is the most parsimonious – often called the optimal cladogram. The others are suboptimal.
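
The step counting just described can be sketched as follows. The procedure used here is the one commonly known as Fitch optimisation, swapped in as a convenient way of counting steps; the text does not say which algorithm any particular program uses. The matrix follows the description of Figure 5, except that the text does not state which taxa carry the autapomorphies 2 and 4, so their placement in B and D is an assumption (it does not affect the lengths, since a character found in a single taxon always costs one step).

```python
# Taxa in which each character is present (absence is taken as plesiomorphic).
PRESENT = {
    1: set("BCD"),
    2: set("B"),    # autapomorphy; taxon assumed for illustration
    3: set("CD"),
    4: set("D"),    # autapomorphy; taxon assumed for illustration
    5: set("CD"),
    6: set("BC"),
}

# The three possible arrangements of B, C and D, with A as the
# wholly plesiomorphic taxon at the base.
CLADOGRAMS = {
    "left": ("A", ("B", ("C", "D"))),
    "centre": ("A", (("B", "D"), "C")),
    "right": ("A", (("B", "C"), "D")),
}


def fitch(node, holders):
    """Return (possible states, steps) for one binary character below `node`."""
    if isinstance(node, str):
        return ({1} if node in holders else {0}), 0
    left_states, left_steps = fitch(node[0], holders)
    right_states, right_steps = fitch(node[1], holders)
    steps = left_steps + right_steps
    common = left_states & right_states
    if common:
        return common, steps          # the branches agree: no extra step
    return left_states | right_states, steps + 1  # disagreement costs one step


def tree_length(tree, present):
    """Total number of steps needed to fit every character to the tree."""
    return sum(fitch(tree, holders)[1] for holders in present.values())


for name, tree in CLADOGRAMS.items():
    print(name, tree_length(tree, PRESENT))
```

Under these assumptions the three lengths come out as seven, nine and eight steps, matching the counts given in the text for the left, centre and right cladograms.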

Figure 5. Optimising characters on to alternative cladograms. See text for discussion.

Notice at this stage that we have made no evaluation of HOW the characters have fit the cladogram. For characters 1, 2 and 4 there is no argument, they all fit once and that is that. Take a look at character 6 on the cladogram to the left (the optimal cladogram). It specifies a group B+C that does not appear in this cladogram (this group appears in the right-hand cladogram). In the optimal cladogram the character has been assumed to have arisen in B and separately in C; parallel origination has been assumed. It has shown two steps, both gains (absence → presence). However, we may have assumed that character 6 has been gained by B, C and D and then subsequently lost in taxon D; this is a gain and a loss (absence → presence → absence) but still records two steps on the tree. As far as parsimony is concerned there is no difference and we cannot distinguish the two scenarios. We may, however, have beliefs outside of cladistics that lead us to favour one of these transformations over the other. For example, some mammalian palaeontologists believe that the origination of a particular cusp pattern may be more closely related to diet rather than genealogy, therefore parallelism is to be preferred to gain plus loss. On the other hand most palaeontologists would assume that complex structures such as legs are unlikely to have been developed more than once and that the absence in snakes is a loss that followed a gain. Notice that these are not cladistic arguments.
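
The two equally short explanations of character 6 can be found by brute force. The sketch below tries every assignment of states to the internal nodes of the left-hand cladogram and keeps the shortest. The node labels `root`, `n1` and `n2` are hypothetical, and the state at the base is fixed to absence, following the assumption above that absence is plesiomorphic.

```python
from itertools import product

# Left-hand (optimal) cladogram of Figure 5 as parent -> child links.
# n1 is the node uniting B, C and D; n2 is the node uniting C and D.
EDGES = [
    ("root", "A"), ("root", "n1"),
    ("n1", "B"), ("n1", "n2"),
    ("n2", "C"), ("n2", "D"),
]
TIPS = {"A": 0, "B": 1, "C": 1, "D": 0}  # character 6: present in B and C


def reconstructions(edges, tips, root_state=0):
    """Yield (steps, changes) for every assignment of states to internal nodes."""
    internal = sorted({parent for parent, _ in edges} - {"root"})
    for states in product((0, 1), repeat=len(internal)):
        node_state = dict(tips, root=root_state, **dict(zip(internal, states)))
        changes = [
            (child, "gain" if node_state[child] else "loss")
            for parent, child in edges
            if node_state[parent] != node_state[child]
        ]
        yield len(changes), changes


results = sorted(reconstructions(EDGES, TIPS), key=lambda result: result[0])
shortest = results[0][0]
minimal = [changes for steps, changes in results if steps == shortest]
for changes in minimal:
    print(shortest, "steps:", changes)
```

Two reconstructions tie at two steps: a gain on the branch to B plus a gain on the branch to C, and a gain on the branch leading to `n1` plus a loss on the branch to D. Parsimony alone gives no reason to prefer either.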

Consensus

It sometimes happens that, having been through the exercise in Figure 5, we arrive at a solution where there is more than one optimal cladogram: that is, two or more cladograms are of equal length. We have several choices at this stage: we could add more characters to try to resolve the problem; we could choose one of the cladograms because it fits the stratigraphic record better, or a palaeobiogeographic theory more comfortably, or simply because it satisfies our preconceptions. Another option is to summarise the information that is common to them all, and this is done through the use of consensus trees. We will devote a few paragraphs to these later.
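
As a foretaste, “the information that is common to them all” can be sketched as the set of groups found in every competing cladogram. This corresponds to what is usually called a strict consensus; other kinds of consensus exist and are not shown. The two trees below are hypothetical and are simply declared to be equally short for the sake of the example.

```python
def clades(tree):
    """Return the groups (frozensets of taxa) defined by a nested-tuple tree."""
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(walk(child) for child in node))
        found.add(members)
        return members

    walk(tree)
    return found


def shared_groups(trees):
    """Groups present in every tree."""
    return set.intersection(*(clades(tree) for tree in trees))


# Two hypothetical cladograms that disagree only about B, C and D.
tree_1 = ("A", ("B", ("C", "D")))
tree_2 = ("A", (("B", "C"), "D"))

for group in sorted(shared_groups([tree_1, tree_2]), key=len):
    print(sorted(group))
```

Here only the group B+C+D (and the trivial group of all four taxa) survives; the conflicting groups C+D and B+C drop out, leaving B, C and D unresolved.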

Cladogram/tree terminology

At this point a pause may be in order to deal with some nomenclatural housekeeping and conventions. I have used the word cladogram up till now but Hennig used phylogenetic tree and technically I should have done the same in the initial descriptions in this introduction. There is an important distinction between a cladogram and a tree that we will come on to. Unfortunately the literature nearly always uses the word “tree” – tree length, tree shape, optimal and suboptimal trees, consensus trees etc. This is because “tree” is a mathematical term and much of the computing side of cladistics is basically maths. It is usually obvious when tree and cladogram are implied.

There are also some terms used to specify parts of the cladogram/tree; these are given in Figure 6 and most are self-explanatory. The ingroup is made up of the taxa whose interrelationships you are interested in. The outgroup is technically the rest of life, but is usually one or more taxa that preconceived ideas hold to be closely related to the ingroup. As we will see later, the outgroup is important because it determines the plesiomorphic/apomorphic states of the characters of the ingroup.

Figure 6. Terminology applied to parts of the cladogram/tree.

Sometimes cladograms/trees are drawn such that each of the branches leading to the terminal taxa and each of the internodes is of equal length, irrespective of how many character changes may be assigned to parts of the tree. This is called a non-metric tree. Another description is a metric tree in which the relative lengths of the branches and internodes are drawn to reflect graphically the numbers of character changes which may be different in different parts of the tree. The results of molecular analyses are often depicted as metric trees to emphasise the great variation in numbers of character changes that often occur in different parts of the tree.

Types of groups

Hennig identified three types of groups, which he recognised on the basis of ancestry and descent. These are shown in Figure 7.

  1. A monophyletic group contains the most recent common ancestor plus all, and only, its descendants. In this figure such groups (with their Linnean names) would be ancestor ‘Z’ and Salmon+Lizard [Z(BC)] – named Osteichthyes; ancestor ‘Y’ and Shark+Salmon+Lizard [Y(ABC)] – named Gnathostomata; or ancestor ‘X’ and Lamprey+Shark+Salmon+Lizard [X(DABC)] – named Vertebrata.
  2. A polyphyletic group is one defined on the basis of convergence, or on non-homologous characters assumed to have been absent in the latest common ancestor (X). A group D+B containing only the lamprey and the salmon, which might be recognised on the shared ability to breed in freshwater, would be considered a polyphyletic group. Breeding in freshwater in vertebrates might be considered to be an apomorphic character but this is inferred to have arisen on more than one occasion. The character by which we might recognise it is non-homologous; it is a false guide to relationship. No Linnean taxon has ever been recognised for this group.
  3. A paraphyletic group is a group remaining after one or more parts of a monophyletic group have been removed. Assuming the truth of the shape of the cladogram in Figure 7 then the group shark+salmon (A+B) is a paraphyletic group that has been traditionally recognised as Class Pisces (fishes). However, one of the included members (B – the salmon) is inferred to be genealogically closer to C the lizard, which is not recognised as part of the group Pisces. The shark and salmon share an ancestor (Y) but not all descendants of that ancestor are included in the group.

Figure 7. The types of cladistic groups recognised by genealogy with Linnean names applied.
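
Whether a proposed group is monophyletic on a given cladogram can be checked directly from the definition: find the smallest clade that contains every member and see whether anything is left over. The sketch below assumes the shape of Figure 7 as described in the text; the function names are hypothetical. It says nothing about whether a non-monophyletic group is paraphyletic or polyphyletic, since that depends on the characters used to recognise it.

```python
# Figure 7 as nested tuples, using the taxon letters from the text:
# D = lamprey, A = shark, B = salmon, C = lizard.
TREE = ("D", ("A", ("B", "C")))


def taxa(node):
    """Return the terminal taxa below a node."""
    if isinstance(node, str):
        return frozenset([node])
    return taxa(node[0]) | taxa(node[1])


def smallest_clade(node, group):
    """Return the taxa of the smallest clade containing every member of `group`."""
    if not isinstance(node, str):
        for child in node:
            if group <= taxa(child):
                return smallest_clade(child, group)
    return taxa(node)


def missing_descendants(tree, group):
    """Taxa that would have to be added to `group` to make it monophyletic."""
    group = frozenset(group)
    return smallest_clade(tree, group) - group


print(sorted(missing_descendants(TREE, "BC")))  # Osteichthyes: nothing missing
print(sorted(missing_descendants(TREE, "AB")))  # Pisces: the lizard is left out
print(sorted(missing_descendants(TREE, "DB")))  # lamprey + salmon
```

An empty result means the group is monophyletic. For shark+salmon the missing taxon is the lizard, which is exactly the descendant that Pisces excludes.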

Most systematists would agree with the desirability of recognising monophyletic groups, and they would also accept the artificiality of polyphyletic groups. It is paraphyletic groups which have been the source of debate, particularly among palaeontologists, because ancestral groups are, by definition, paraphyletic (Pisces ancestral to Tetrapoda, Reptilia ancestral to Aves and Mammalia).

The ‘defining attributes’ of paraphyletic groups, such as Pisces, are symplesiomorphies: that is, they are attributes of a more inclusive group. In Figure 3 the group Salmon+Shark shares a common ancestor Y recognised by the possession of characters 1 and 2. But these are characters of the group (salmon+shark+lizard). The group salmon+shark does not have any unique characters: in fact it can only be recognised by stating BOTH what it has (characters 1 and 2) AND what it does not have (characters 5 – 9). Since the attributes that the group salmon+shark shares with the lizard are not unique to it, then it can only be unique in what it does not have (characters 5 – 9). Unfortunately most of life also does not have characters 5 – 9. Lest you think that this is only metaphysical solipsism, the bottom line is that it is difficult to know if a newly discovered fossil is a member of an ancestral group if there are no identifiable characters by which to identify membership. And this is compounded by the fact that if we have to identify what it does NOT have, then we have to be sure that the absence is not preservational. Yet, much of the palaeontological literature is swollen with arguments over whether X or Y is the ancestor.

Another reason why paraphyletic groups have been popular in the past is that it was thought that information about evolutionary divergence could be conveyed. To recognise a paraphyletic group Pisces is also to recognise the collateral group Tetrapoda. This is done to emphasise the many autapomorphies of this latter group. In the terminology of evolutionary taxonomy these tetrapod characters were seen as evidence that tetrapods had shifted to a new adaptive zone (involving life on land, receiving stimuli through air rather than water etc.). In a cladistic classification, such divergence would be expressed through the number of autapomorphies identifiable in tetrapods.

Up to this point the types of groups have been described as Hennig did – in terms of common ancestry. But groups are not discovered in this way – they are discovered through character distribution. So, we must return to characters to look again at the definition of groups. In Figure 3 the monophyletic groups salmon+lizard and shark+salmon+lizard are each recognised by synapomorphies (characters 3 and 4, and characters 1 and 2 respectively). They can be thought of as evolutionary homologies. So homology is equivalent to synapomorphy and monophyletic groups are discovered through the discovery of synapomorphies. [But notice that synapomorphies can be shown to be false if character congruence suggests a different grouping: therefore an homology is a theory that may be shown to be false – more about this in the next article.]

Figure 8. Types of groups recognised by character distributions.

Paraphyletic and polyphyletic groups are recognised by the distribution of characters. Paraphyletic groups are those groups recognised by symplesiomorphies: that is, characters useful at a more inclusive level in the hierarchy. Polyphyletic groups are recognised by homoplasious distributions of characters. These relations are shown in Figure 8.

Figure 9. Cladograms and trees. The five trees shown at the bottom have an implied time axis.

Cladograms and trees

Throughout this introduction we have been slowly shifting away from Hennig’s evolutionary explanations for concepts of relationship, characters and groups. We do not discover the relationships between taxa by discovering ‘evolution’. We can only use the distribution of characters (and just to appease some of the hard-line palaeontologists there are some cladists who recognise stratigraphic occurrence as a character – more later). The relationships illustrated in Figure 3 can be written as a branching diagram (upright, on its side or upside down) as at the top of Figure 9. But this diagram could just as easily be illustrated as a Venn diagram, or be written in parenthetical notation as shown in the top half of Figure 9. A cladogram has no implied time axis. It is a diagram that summarises a pattern of character distribution. The nodes of the branching diagram denote a hierarchy of synapomorphies. There is no implication of ancestry and descent.
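
That the branching diagram, the Venn diagram and the parenthetical notation carry the same information can be shown with a small sketch. The nested-tuple encoding and the function names are assumptions made for illustration; the second diagram, with the branches written in the opposite order at every node, is likewise an added example.

```python
CLADOGRAM = ("lamprey", ("shark", ("salmon", "lizard")))


def parenthetical(node):
    """Write a nested-tuple cladogram in parenthetical notation."""
    if isinstance(node, str):
        return node
    return "(" + ", ".join(parenthetical(child) for child in node) + ")"


def members(node):
    """Return the terminal taxa below a node."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(members(child) for child in node))


def venn_sets(node):
    """List the nested sets that a Venn diagram of the cladogram would draw."""
    if isinstance(node, str):
        return []
    sets = [members(node)]
    for child in node:
        sets.extend(venn_sets(child))
    return sets


print(parenthetical(CLADOGRAM))
for group in venn_sets(CLADOGRAM):
    print(sorted(group))

# The same cladogram with every node's branches written the other way round.
rotated = ((("lizard", "salmon"), "shark"), "lamprey")
print(set(venn_sets(rotated)) == set(venn_sets(CLADOGRAM)))
```

The nested sets are identical however the diagram is written, and nothing in either representation refers to time or to ancestors.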

Given the character information contained in this Venn diagram there are, however, a number of equivalent evolutionary trees which include time, and which embody the concepts of ancestry and descent with modification. Five such are shown below. Some of these trees assume that one or other of the taxa (A, B, C, D) are real ancestors. Other trees include hypothetical ancestors (x, y, z). Only one tree has the same topology as the cladogram, and this is the one in which the nodes represent hypothetical ancestors. The others contain one or more real ancestors. Choice between these trees depends on factors other than the distribution of characters over the sampled taxa, the only empirical content. Selection of one tree in preference to any other may depend on our willingness to regard one taxon as ancestral to others. Alternatively, we might say that some trees containing real ancestors are less likely to be true than others because of unfavourable stratigraphic sequence. The important point is that evolutionary trees are very precise statements of singular history, but their precision is gained from criteria other than character distributions; and these trees cannot be justified on characters alone.

The distinction between cladograms and trees is important because many people have taken the cladogram as a statement about evolution. To do this we must be prepared to accept other beliefs: for instance, that evolution is parsimonious, or that evolution proceeds exclusively by branching. Many of the criticisms of cladistics are levelled at the claim that these are unrealistic assumptions of evolution. Indeed they are. But they are not assumptions of cladistics or cladograms. They are assumptions of trees. The cladogram, as a distribution of characters, is the starting point for further analysis. Many systematists do in practice turn their cladograms into trees in order to say something about evolution. And some cladists do recognise ancestors after the analysis (more later).

Further Reading

There are many, many books available. The following are relatively short and the most compatible with this series of articles.

Ax, P. 1987. The phylogenetic system: the systematization of organisms on the basis of their phylogenesis. John Wiley & Sons. ISBN 047 1907545. [A weighty text book but one that clearly explains phylogenetic systematics from a Hennigian perspective.]

Hall, B. G. 2001. Phylogenetic trees made easy. Sunderland Massachusetts, Sinauer Associates. ISBN 0 87893. [A textbook stressing molecular phylogenetic systematics, but it comes from the same stable as the PAUP computer program and contains clear instructions on what the program is doing – and why.]

Kitching, I. J., Forey, P. L., Humphries, C. J. and Williams, D.M. 1998. Cladistics. 2nd edition. Oxford, Oxford University Press. ISBN 0 19 850138 2. [A textbook dealing with the theory behind cladistics and parsimony analysis.]

Schuh, R. T. 2000. Biological systematics: principles and applications. Phylogenetic analysis of morphological data. Ithaca: Cornell University Press. ISBN 0 8014 3675 3. [A hard line parsimony approach, up to date and easy to read.]

Skelton, P., Smith, A. and Monks, N. 2002. Cladistics: a practical primer on Cd-Rom. Cambridge, Cambridge University Press. ISBN 0 521 52341. Pp. 1–80 [A nice easy book with tied in computer practical.]

Smith, A.B. 1994. Systematics and the fossil record: documenting evolutionary patterns. Oxford: Blackwell Science. ISBN 0632036427. [A very well known and well used book. It contains several sections directly relevant to the more theoretical side of this series of articles and is about the only one dedicated to the palaeontological viewpoint.]



Created by Alan R.T. Spencer on the 2014-12-09. (Version 2.0)