
Cladistics for Palaeontologists: Part 2 - Cladistic Characters

Article from: Newsletter No. 61

2. Cladistic Characters

Any variation between individuals and taxa may be considered as characters to be used in reconstructing phylogeny. Such variation may be morphological, physiological, behavioural, ecological or molecular. For present, palaeontological purposes I will stress morphological variation as characters. Some palaeontologists, who are also cladists, additionally use stratigraphic variation of taxa as characters in order to reconstruct phylogeny: this is a more contentious issue to which I will return in another article. Stratigraphy has also been used to choose between equally parsimonious trees, as well as to root the tree, but since these are activities that we do after cladogram/tree construction I will leave these issues until later.

The delimitation of characters – how they are coded to be used in phylogenetic analysis – is not the sole province of cladistics: it extends into evolutionary taxonomy and phenetics (numerical taxonomy). However, the parsimony algorithms used in computer-generated cladograms do impose certain constraints on how we code variation, and how we interpret the consequences of particular ways of coding.

The main point to be made in this article is that the selection and coding of characters is the key stage of cladistic analysis. The way we code characters can influence the phylogenetic hypothesis. Once the data matrix of taxa against character codes is constructed, the analysis is done. That which flows from the matrix is maths (and maybe some special pleading). It is the construction of the data matrix that is the key advantage of cladistic analysis because it forces the investigator to think about how the variation is to be described; it forces the investigator to look for variation in all taxa, to be precise in the translation of variation to codes for analysis, and to understand the consequences of their actions. Here endeth the sermon!

The subject of characters – even as to what constitutes a character – has been discussed extensively and has generated a vast literature. In part this is because it is characters that enable us to recognise groups (taxa), so the idea of a character is intimately tied with the most fundamental concept in biology – homology.

Here are some definitions of characters:

  • “A character in systematics may be defined as any feature which may be used to distinguish one taxon from another.” (Mayr et al. 1953)
  • “A character is a feature of an organism that can be evaluated as a variable with two or more mutually exclusive and ordered states” (Pimentel & Riggins 1987)
  • “A character is a theory that two attributes which appear different in some way are nevertheless the same.” (Platnick 1979)

These ideas can be illustrated by Figure 1, which shows the structure of a crustacean appendage. In this figure a crustacean maxilla is shown to the left, and an amphipod maxilla to the right.

Figure 1. Same but different. These two structures are deemed to be the same even though they look different. They are the ‘same’ because they are formed on the same serial segment of the head, formed of the same tissues etc. They are different in their shape and complexity. Therefore they can be thought to be part of the same character, but to show different states (e.g. endopodite smaller/larger than scaphognathite etc).

So, characters concern identity or otherwise of the observation as well as the notion of homology (theory of sameness). There are two stages in this argumentation. The first is to suggest that each of the observations is somehow the same and each allows us to recognise a group; the second is to put that theory to the test. Some people speak of primary and secondary homology. Primary homology is the initial assessment of identity – is it the same thing? Similar topographic position, similar ontogeny and similar histological structure may be factors helping us make this decision. Phylogenetic analysis allows us to recognise secondary homology and establish characters of groups. For most people the delimitations of characters and states of characters is the domain of primary homology, and for all investigators it is the first thing that is done. Therefore we will concentrate on this.

Nearly all modern cladistic analyses are computer-driven, simply because of the scale of the problem (see next article on tree building). Any morphological variation is translated to discrete integers or symbols (1, 2, 3, 4 etc or a, b, c, d etc). This differs from phenetics (numerical taxonomy) where continuous variables can be accepted in raw form (1.342, 1.784, 2.345 etc). Morphological variation can be described under two contrasting forms – qualitative or quantitative; discrete or continuous. You can, of course, have a character in which the states can be both discrete and quantitative.

Qualitative variation may be exemplified by ‘leaves round’ and ‘leaves ovate’, whereas a quantitative character may be ‘eye comprising at least half head length etc’. We are naturally inclined to prefer the first kind over the second as being more clear cut. However, many so-called qualitative characters are, in fact, quantitative, and in reality we have placed some observational filter to convert quantitative into qualitative. For example, amongst those taxa deemed to have round leaves there may well be some where not all diameters of a leaf are equal (round) and which may be more appropriately described as nearly ovate. In other words there may be some overlap in attribute features. Morphometric techniques may help here in being able to describe shapes mathematically, and give numerical justifications for separating round from ovate. They can also critically divide up morphological variation into a number of states. An example is given by MacLeod (2002).

Another way in which we may describe variation in states of a character is as discrete or continuous. Discrete variation is that which may be coded as single integers: presence/absence (0, 1), or two digits, four digits, five digits (0, 1, 2). There can be no logical intermediates. Continuous data are those that can take a potentially infinite number of values. Classic examples are ratios – head length vs. head breadth (biometric data), or numbers of vertebrae (meristic data).

No matter what the kind of data is, we need to evaluate the degree of overlap in the sample that we have in order to separate it into two or more states (Figure 2). Where the individuals showing two or more values are completely separate from one another – disjunct – there is no problem. The difficulty comes when there is overlap. How much overlap are we to allow before we can no longer recognise two (or more) states?

Figure 2. Morphological variation can be divided into two or more states very easily if there is no overlap (right-hand column). The difficulties occur when there is overlap (left column); some individuals of taxon 1 show feature attributes more commonly associated with individuals of taxon 2.
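The question of "how much overlap" can at least be put on a numerical footing. The following is a minimal sketch only: the function name `overlap_fraction` and the measurements are invented for illustration and are not taken from any of the methods cited here. It reports the fraction of all individuals that fall inside the zone where the observed ranges of two taxa overlap; a value of zero means the two samples are disjunct.

```python
def overlap_fraction(sample_a, sample_b):
    """Fraction of all individuals lying inside the zone shared by both ranges."""
    lo = max(min(sample_a), min(sample_b))   # lower edge of the shared zone
    hi = min(max(sample_a), max(sample_b))   # upper edge of the shared zone
    if lo > hi:                              # the ranges are disjunct
        return 0.0
    pooled = list(sample_a) + list(sample_b)
    inside = [x for x in pooled if lo <= x <= hi]
    return len(inside) / len(pooled)

# Hypothetical measurements (e.g. a length ratio) for three taxa
taxon1 = [1.0, 1.2, 1.3, 1.5, 1.9]
taxon2 = [1.8, 2.0, 2.2, 2.4, 2.6]
taxon3 = [3.0, 3.1, 3.4]

print(overlap_fraction(taxon1, taxon2))  # some individuals share values
print(overlap_fraction(taxon1, taxon3))  # disjunct samples
```

Where you set the threshold for recognising separate states remains a judgement for the investigator; the number only makes that judgement explicit.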

Discrete variables

These come in two basic forms, binary and multistate. Binary characters (e.g. absence/presence) have just two states, usually coded as ‘0’ and ‘1’. They are relatively unambiguous. Multistate characters have more than two states ‘0’, ‘1’, ‘2’, ‘3’ etc (the PAUP* computer program allows up to 32 states).

For much of our variation there may be obvious and limited ways of coding. But for some variation there are some tantalising choices. These dilemmas usually exist where the variation is linked and conflicting. For example, let us assume that we had a group of vertebrates, some of which had fingers and some did not. Amongst those that had fingers some had one bone in a particular digit and others had two. Let us further assume that some of those with one-boned fingers had claws and some had hooves, and some of those with two-boned fingers had claws and some hooves. Here we are dealing with three variables that may all be ascribed to a single structure – the digit (either the fingers are there or not; if they are there they may be one-boned or two-boned, and they may be clawed or hoofed). I use this example because Jerry Hooker tells me this is a real situation in mammals (I’ve added the total absence – which could be a snake, for instance). Figure 3 shows several ways of coding this variation (there are other ways – see Forey & Kitching 2000). In all but the first method of coding (the pure multistate character) the coding results in the variation being split into more than one character. That’s OK, but remember that in cladistic analysis every column of data contributes to the final hypothesis of relationship – indeed it is expected that each column of data will be independent of any other. We could argue that this is not the case in methods B and C. In the most extreme example (method C) this variation in the digit has been translated into five characters – and more importantly those taxa that do not have fingers have been scored as such five times! It also means that taxon W, that has no toes, is scored the same as taxon V, that has toes but no hooves. Method B is interesting in that those taxa that do not have fingers have been scored as question marks – meaning here “not applicable”. Question marks can be problematical as we will learn later.

Figure 3. Given the variation in digit structures shown amongst five taxa at top, there are several different ways of coding the variation (three methods are shown here). The method of coding can influence the outcome of phylogenetic analysis because it is assumed that each column of data provides an independent hypothesis of relationship.
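The consequences of the different coding schemes are easier to see when the codes are written out. The sketch below is an assumption-laden illustration, not a copy of Figure 3: the taxon labels T1 to T5, the function names and the exact code assignments are all hypothetical. It codes the same five digit conditions as one multistate character, as three characters using ‘?’ for “not applicable”, and as five absence/presence characters.

```python
# Observed digit condition in five hypothetical taxa:
# (digits present?, bones per digit, tip); None means nothing to observe
observations = {
    "T1": (False, None, None),
    "T2": (True, 1, "claw"),
    "T3": (True, 1, "hoof"),
    "T4": (True, 2, "claw"),
    "T5": (True, 2, "hoof"),
}

def code_multistate(obs):
    """One character, five states: each combination gets its own code."""
    states = {(False, None, None): "0", (True, 1, "claw"): "1",
              (True, 1, "hoof"): "2", (True, 2, "claw"): "3",
              (True, 2, "hoof"): "4"}
    return states[obs]

def code_with_inapplicables(obs):
    """Three characters; '?' marks 'not applicable' when digits are absent."""
    present, bones, tip = obs
    if not present:
        return "0??"
    return "1" + ("0" if bones == 1 else "1") + ("0" if tip == "claw" else "1")

def code_absence_presence(obs):
    """Five characters, each scored absent (0) or present (1)."""
    present, bones, tip = obs
    features = [present, bones == 1, bones == 2, tip == "claw", tip == "hoof"]
    return "".join("1" if f else "0" for f in features)

for name, obs in observations.items():
    print(name, code_multistate(obs), code_with_inapplicables(obs),
          code_absence_presence(obs))
```

In the absence/presence version the taxon without digits is scored ‘0’ in all five columns, and in the ‘hoof’ column it receives the same code as a clawed taxon, which is exactly the kind of non-independence discussed above.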

Continuous variables

In order to incorporate continuous variables with discrete variables the former are usually recoded as discrete characters. This recoding usually takes the form of some gap-coding method whereby the variation is segmented where there is a gap or low frequency of overlap of observation. Several ways have been devised for doing this and a good review of strengths and weaknesses of various methods is given by Reid & Sidwell (2002). One common way of coping with a broad range of variation in any one variable (e.g. snout length) is to use the gap-weighting method devised by Thiele (1993). This method uses standard gap coding but imposes a weight as well, meaning that it is going to ‘cost a lot of steps’ to go from the shortest to the longest but much less to pass between adjacent lengths – and the width of the gaps is taken into account (Figure 4).

Continuous variables are almost always coded as multistate characters.

Figure 4. The gap-weighted method of Thiele (1993), used to give codes that reflect not only the value of the attribute but also the distance between adjacent values. (a) Frequency distribution curves for six taxa. (b) Means or medians (you would normally range-standardise the data). (c) The total range is then partitioned into a set number of equal units (in this case ten) and codes are given according to the position into which the taxon means fell.
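The arithmetic behind this kind of coding is simple enough to sketch. What follows is my reading of the procedure as summarised in the caption, not a transcription of Thiele's own formulation: the function name `gap_weighted_codes`, the default of ten states and the snout-length means are all assumptions. Each taxon mean is range-standardised to lie between 0 and 1, scaled to the number of available states, and rounded to an integer code that is then treated as an ordered character.

```python
def gap_weighted_codes(means, n_states=10):
    """Range-standardise taxon means and scale them onto integer codes 0..n_states-1."""
    lo, hi = min(means.values()), max(means.values())
    if hi == lo:                                  # no variation to code
        return {taxon: 0 for taxon in means}
    codes = {}
    for taxon, value in means.items():
        standardised = (value - lo) / (hi - lo)   # now between 0.0 and 1.0
        codes[taxon] = round(standardised * (n_states - 1))
    return codes

# Hypothetical mean snout lengths for six taxa
snout_means = {"A": 10.0, "B": 11.2, "C": 11.9, "D": 15.1, "E": 18.0, "F": 19.0}
codes = gap_weighted_codes(snout_means)
print(codes)
```

Because the codes are analysed as an ordered character, a change between two taxa with nearly equal means costs one step, whereas a change between the shortest and longest snouts costs nine. Note that Python's `round` sends exact halves to the nearest even integer; the sample values were chosen to avoid that case.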

Ordered and unordered characters

Characters may be ordered or unordered in their transformation between states. This choice will only affect multistate characters. In an ordered multistate character with three states 0, 1, 2, to pass between state ‘0’ and state ‘2’ is going to cost two steps (0 ↔ 1, 1 ↔ 2); that is, it is incremental. And this can happen in either direction. In an unordered character, a transformation between any two states costs one step. You may decide to order a character such as limb with two segments (0), three segments (1), four segments (2) if you believe that the evolutionary transformation from two segments to four segments must have gone through the three-segment stage. Imposing order will make some trees shorter (more parsimonious) than others (Figure 5).

Figure 5. When using ordered characters you should be aware that some cladograms will be shorter than others and will be preferentially selected simply because of optimisation of character states. Here are two trees involving four taxa with character states of a single character given at left. The character states can be optimised on to each of these trees (two of 15 possibilities) such that the tree on the left is more parsimonious (fewer steps); we will deal with the precise way of optimising characters in the next article. The reconstructed node states (ancestral states on trees) are in square brackets. If the character were unordered there would be no difference because any transformation is made with equal cost in numbers of steps. Both trees would be retained as equally parsimonious.
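The effect of ordering on tree length can be checked by brute force on a small example. The sketch below is illustrative only: the tip states, the two trees and the function names are hypothetical and are not the ones drawn in Figure 5. It uses a simple dynamic-programming count of steps (in the style usually attributed to Sankoff), which is not necessarily how any particular parsimony program does it.

```python
import math

def step_cost(a, b, ordered):
    """Steps needed to change state a into state b."""
    return abs(a - b) if ordered else int(a != b)

def tree_length(tree, tip_states, n_states, ordered):
    """Minimum number of steps for one character on a rooted binary tree.

    `tree` is a nested tuple of taxon names, e.g. (("A", "B"), ("C", "D")).
    """
    def costs(node):
        # For each possible state at this node, the cheapest cost of
        # everything below it.
        if isinstance(node, str):                       # a terminal taxon
            return [0 if s == tip_states[node] else math.inf
                    for s in range(n_states)]
        child_costs = [costs(child) for child in node]
        return [sum(min(step_cost(s, t, ordered) + c[t]
                        for t in range(n_states))
                    for c in child_costs)
                for s in range(n_states)]
    return min(costs(tree))

# Hypothetical states of one three-state character in four taxa
tips = {"A": 0, "B": 1, "C": 2, "D": 2}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))

for ordered in (True, False):
    print("ordered" if ordered else "unordered",
          tree_length(tree_1, tips, 3, ordered),
          tree_length(tree_2, tips, 3, ordered))
```

With these made-up data the two trees are the same length when the character is unordered, but the first tree is one step shorter when it is ordered, so the decision to order has chosen between them.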

Polarisation of characters

Here the investigator can impose a direction of transformation between states. In other words we can specify, before the analysis, which state is to be regarded as plesiomorphic and which is apomorphic. As Figure 2 in the first article showed, taxa are grouped only on the apomorphic state. There have been several criteria used to determine which is the plesiomorphic and which is the apomorphic state. Here are some of the common ones.

  • Ingroup commonality: the plesiomorphic state is that which is most common in the ingroup (the group of interest).
  • Stratigraphic: the plesiomorphic state is automatically that which occurs in the earlier fossil. Of course, this should be true but it depends on assessments of the quality of the fossil record etc.
  • Biogeography: the state of the character found in members occupying the presumed centre of origin is to be regarded as plesiomorphic with respect to the state found in the members most widely removed from the centre of origin. This assumes that the plesiomorphic species sits tight and more derived species are to be found more distant. The alternative – that the more derived species have literally pushed out the plesiomorphic species – is not allowed.
  • Ontogenetic criterion: the state of the character occurring in earlier growth stages is to be regarded as the plesiomorphic condition. This is often justified in terms of Haeckel’s law of recapitulation but is actually more closely aligned with Von Baer’s law that general characters appear before the special characters (e.g. the egg appears before the neural tube).
  • Outgroup criterion: that character state which occurs in the outgroup taxon is to be regarded as the plesiomorphic condition.

In modern cladistic analyses it is usually only the last that is used to polarise characters, and this is effectively done automatically. If you wish to impose polarity using the other criteria then you would make a hypothetical ancestor that incorporated all the plesiomorphic states and use that as the outgroup (root).
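Building such a hypothetical ancestor is just a matter of adding one more row to the matrix. As a minimal sketch, assuming the ingroup commonality criterion and a made-up matrix (the function name `commonality_ancestor` and the taxon codes are hypothetical), the row can be assembled column by column:

```python
from collections import Counter

def commonality_ancestor(matrix, missing="?"):
    """Build an all-plesiomorphic row using the ingroup commonality criterion."""
    n_chars = len(next(iter(matrix.values())))
    ancestor = []
    for i in range(n_chars):
        column = [row[i] for row in matrix.values() if row[i] != missing]
        # Most frequent observed state; a tie goes to the state met first,
        # which is an arbitrary choice made only for this sketch.
        ancestor.append(Counter(column).most_common(1)[0][0] if column else missing)
    return "".join(ancestor)

# Hypothetical ingroup: taxon name -> one code per character
ingroup = {"A": "0011", "B": "0101", "C": "0?10", "D": "1011"}

ancestor_row = commonality_ancestor(ingroup)
rooted_matrix = {"ANCESTOR": ancestor_row, **ingroup}
print(rooted_matrix)
```

Note that the plesiomorphic state need not be coded ‘0’: here the commonest state of the last two characters is ‘1’. For any of the other criteria you would simply type in the ancestor row by hand.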

Therefore there are at least two ways (order and polarity) in which we may constrain the behaviour of a character with three or more states (Figure 6). Constraining the possibilities may reduce the number of equally parsimonious trees, but remember that this action requires independent justification. The most obvious situation in which there may be justification is where different ontogenetic stages are linked into a multistate character. The order is given by the ontogenetic trajectory and the polarisation is given by the earliest ontogenetic stage.

Figure 6. Given three states there are nine possible transformations in an unordered character (left). Imposing an order allows three transformations (centre). Imposing polarity (right) and order allows only one transformation.

Data input

Whichever way the variation is divided into characters and character states, we then need to get the information into the PAUP* program. PAUP* accepts Nexus files. You can write such a file in a word processor and save it as a text file, or you can open the PAUP* program and write a new file directly. The PAUP* manual gives you the precise syntax.

By far the easiest way is to use a separate program. If you are Mac-based then the MacClade program is the one to use. For PC users then Nexus Data Editor is the one for you (see Nerd Notes at the end for details). Both of these programs open up a spreadsheet in which you enter taxa names and character numbers and the relevant character codes. Normally these codes will be 0, 1, 2, 3 etc. You will need a code for missing data (usually ‘?’). You can also enter polymorphic codes. For example if a particular taxon had components (individuals, species) some of which showed the ‘0’ state and some showed the ‘1’ state, this could be entered as such. Personally I would shy away from this because the output is difficult to interpret. You can also enter codes for ‘not applicable’ (usually ‘N’ is used). This is useful in the data matrix stage which will be published and conveys to the reader the precise nature of the ambiguity, but computationally these are simply translated to question marks in the tree-building phase.
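If you would rather generate the file yourself, a bare-bones Nexus data block is easy to produce. The sketch below is a minimal illustration under several assumptions: the function name `to_nexus`, the taxon names and the codes are invented, taxon names are assumed to contain no spaces, and polymorphic entries are not handled. Check the PAUP* manual for the full set of format options before relying on it.

```python
def to_nexus(matrix, symbols="0123", missing="?"):
    """Return a minimal Nexus DATA block for a taxon -> code-string matrix."""
    n_chars = len(next(iter(matrix.values())))
    if any(len(row) != n_chars for row in matrix.values()):
        raise ValueError("all taxa need the same number of characters")
    width = max(len(name) for name in matrix)       # pad names so codes line up
    lines = [
        "#NEXUS",
        "BEGIN DATA;",
        f"  DIMENSIONS NTAX={len(matrix)} NCHAR={n_chars};",
        f'  FORMAT DATATYPE=STANDARD MISSING={missing} SYMBOLS="{symbols}";',
        "  MATRIX",
    ]
    lines += [f"    {name.ljust(width)}  {row}" for name, row in matrix.items()]
    lines += ["  ;", "END;"]
    return "\n".join(lines)

# Hypothetical matrix: three taxa, three characters, one missing entry
example = {"Outgroup": "000", "Taxon_A": "10?", "Taxon_B": "112"}
nexus_text = to_nexus(example)
print(nexus_text)
```

The resulting text can be saved with a plain-text extension and opened in the tree-building program in the usual way.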



FOREY, P. and KITCHING, I. J. (2000). Experiments in coding multistate characters. In SCOTLAND, R. W. and PENNINGTON, R. T. (eds) Homology and systematics. Pp. 54–80. London, Taylor & Francis.

MACLEOD, N. (2002). Phylogenetic signals in morphometric data. In MACLEOD, N. and FOREY, P. L. (eds) Morphology, shape and phylogeny. Pp. 100–138. London, Taylor & Francis.

MAYR, E., LINSLEY, E. G. and USINGER, R. L. (1953). Methods and principles of systematic zoology. New York: McGraw-Hill.

PIMENTEL, R. A. and RIGGINS, R. (1987). The nature of cladistic data. Cladistics 3: 201–209.

PLATNICK, N. I. (1979). Philosophy and the transformation of cladistics. Systematic Zoology 28: 537–546.

REID and SIDWELL (2002). Overlapping variables in botanical systematics. In MACLEOD, N. and FOREY, P. L. (eds) Morphology, shape and phylogeny. Pp. 53–66. London, Taylor & Francis.

THIELE, K. (1993). The holy grail of the perfect character: the cladistic treatment of morphometric data. Cladistics 9: 275–304.


PATTERSON, C. (1982). Morphological characters and homology. In JOYSEY, K. A. and FRIDAY, A. E. (eds). Problems of Phylogenetic Reconstruction, Systematics Association Special Volume, No. 21. pp. 21–74. London, Academic Press. [A key paper in our understanding of homology and characters.]

WILLIAMS, D. M. (2004). Homologues and homology, phenetics and cladistics: 150 years of progress. In WILLIAMS, D. M. and FOREY, P. L. (eds). Milestones in Systematics. The Systematics Association Special Volume Series 67. pp 191–224. Boca Raton, CRC Press. [An excellent article on the history of views on homology and how it impinges on modern systematics.]

WILLS, M. A. Morphological disparity: a primer. In ADRAIN, J. M., EDGECOMBE, G. D. and LIEBERMAN, B. S. (eds). Fossils, Phylogeny, and Form – an analytical approach. Pp. 55–144. New York: Kluwer Academic. [Although this chapter is about morphological disparity it does contain some clearly written discussion on discrete and continuous characters and it is packed with references for further reading.]

Nerd Notes

Here are a few details on commonly used computer programs.

Nexus Data Editor (NDE), written by Rod Page, is a spreadsheet data editor that interacts with the Windows PC version of PAUP*. It is virtually self explanatory. It allows you to annotate your data with text and pictures. It is free to download at

MacClade 4 is a data editor as well as a tree manipulator. It was written by Wayne and David Maddison. It interacts with the Mac version of PAUP*. It allows you to enter data into a matrix and provides a limited capability for annotating the characters. It is far more powerful than NDE because it also has capabilities for manipulating trees and character optimisation, and to output trees graphically. It comes with a manual. More information at

PAUP* 4.0 Beta, written by David Swofford, is the key parsimony program to construct trees and get data output (although it does other things that molecular systematists like to play around with). It is only available in beta version – no idea when it will be finished. You buy a program and an updater (supplied on the same CD, both of which have to be loaded). It comes in three versions. One is for the PC (Windows or DOS). This is command line driven, which sounds a bit cumbersome but is quite easy to get used to. The Mac version requires either Mac OS 9 or OS X with Classic installed, and is either command line driven or by means of pull down menus. The last version is the Linux version but I know of no one who uses this. It must be purchased (Windows version $85, Mac version $100, Linux version $150 - correct as of 09/12/2014). It is distributed by Sinauer Associates. PAUP* does not come with a manual but one can be downloaded. More information at the PAUP website

PAST (PAleontological STatistics) by Øyvind Hammer, of the Paleontological Museum of the University of Oslo. This may already be familiar to many of you. It does a variety of things, one of which is parsimony analysis. I have tried it and find it OK for small, clean data sets but some of the heuristic addition sequence options found in PAUP* do not appear to be there, and there is a danger of missing alternative and more parsimonious trees. It can be downloaded free with manuals from the PAST website

If you want more information on a variety of phylogenetic programs and associated tree manipulators then try this site:

Author Information

Peter Forey - The Natural History Museum, London, UK
