Monday, September 19, 2016

The ancestors are not among us

>>Terms like 'basal', 'early-diverging', and 'first-branching' reflect persistent misconceptions about evolution and phylogenies

Why take the time to blog about the issues with the use of the word "basal" and similar terms?

This is a good question because, indeed, many tree-thinking papers have directly discussed misconceptions related to interpreting phylogenies (e.g., Omland et al. and Meir et al.). In fact, several have tackled issues surrounding the term "basal" specifically (see Krell and Cranston and Crisp and Cook). Given these efforts, I hoped that the use of this term (and the associated misconceptions) would begin to erode. Unfortunately, I feel as though, if anything, the problem is becoming more widespread. I think this is largely for several good reasons -- building phylogenies is continually becoming easier, even for large datasets, and many researchers from a range of fields are seeking to incorporate an evolutionary perspective into their research. However, it's important to point out that the misinterpretation of phylogenies is just as common in evolutionary biology as in any other field. So in short, the “basal” problem is not going to disappear without active efforts to teach tree-thinking to all biologists at all stages.

So what is the problem with 'basal', exactly?

The problem is that the term is used incorrectly and/or in misleading ways in talks, papers, and proposals, roughly 90% of the time (by my estimate).  Moreover, the use of basal and similar terms perpetuates a large suite of misconceptions about how evolution works.  So in order to communicate effectively and accurately about evolution, we must also communicate effectively and accurately about trees.  As I have struggled to understand the desire to describe some taxa as basal (or early-diverging or early-branching), I've assembled a mental list of the various ideas that speakers and writers seem to be aiming to communicate with the use of these terms.  I've listed these below along with comments about relevant misconceptions.  This list overlaps with what has been described in the publications listed above.

1) A basal species is one that has given rise to another species, i.e. some ancestral lineage. 

I think this use of the term is related to the misconception that some living taxa are the ancestors of other living taxa.  Unless something has changed with respect to the space-time continuum, this is not possible.  The ancestors are lineages that are no longer present -- they are represented in the tree as internal nodes and internal branches.

2) A basal taxon is one that is older than other taxa in the tree.

If all of the taxa at the tips are extant (i.e., not extinct), then all of them are the same age. They all have the same root-to-tip distance in terms of time. In other words, they have all evolved for the same amount of time from the base of the tree. (Note that this is a correct use of base -- the base is the earliest part of the tree, the root, and time proceeds forward from that point.) It is worth noting that in molecular phylogenies, where branch lengths reflect the amount of sequence change rather than time, some tips may sit on longer or shorter paths from the root (i.e., the tree may not be ultrametric). This is due to a combination of stochasticity in the substitution process and differences in substitution rates across lineages. However, we would not say that the taxa on longer branches are "more evolved" than the other taxa on the tree.
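The equal root-to-tip distance can be checked with a few lines of code. The following is only a minimal sketch: the nested-list tree representation, the function name `root_to_tip`, and the branch lengths (arbitrary time units) are all made up for illustration and do not come from any real dated phylogeny or phylogenetics package.

```python
def root_to_tip(node, depth=0.0):
    """Return {tip name: summed branch length from the root to that tip}."""
    if isinstance(node, str):              # a tip is just its name
        return {node: depth}
    distances = {}
    for child, length in node:             # an internal node is a list of (child, branch length)
        distances.update(root_to_tip(child, depth + length))
    return distances

# Hypothetical time-calibrated tree: frog is sister to (human, lizard).
# Branch lengths are invented, in arbitrary time units.
tree = [
    ("frog", 10.0),
    ([("human", 7.0), ("lizard", 7.0)], 3.0),
]

distances = root_to_tip(tree)
print(distances)  # every value is 10.0: all extant tips are the same "age"
```

If the branch lengths were amounts of sequence change instead of time, the values could differ among tips, which is the non-ultrametric case described above.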

3) A basal taxon is primitive morphologically or in some other sense.

All species, extant or extinct, possess a mixture of characteristics that are, in cladistic terminology, ancestral (plesiomorphic) or derived (apomorphic) relative to other species.  For example, in reptiles having scales and four limbs is the ancestral state. Snakes have retained the ancestral state of having scales but have the derived state of no limbs.  This highlights the fact that no species can be accurately described in evolutionary terms as 'primitive', 'ancestral', 'lower', or 'basal' any more than they can be described as 'derived', 'advanced', or 'higher'. In this sense, the tree-thinking view of species diversity is rather egalitarian.  All of the species on earth have evolved the same amount of time from the last common ancestor of all life some 3 to 4 billion years ago, and their diverse forms reflect the accumulation of changes during their unique pathways along the tree to the present. It's worth noting that this misconception (that not all species are equally 'evolved', or equally 'advanced') has been linked to the history of progressive ideas in evolution, and specifically, the notion that humans sit at the apex of a ladder of life. This is exactly why terms like basal are more than just poor wording; they perpetuate the incorrect interpretation of phylogenies as ladders of progress. We can't expect to improve understanding of the tree-like nature of evolution while continuing to use misleading terminology.

4) Basal lineages sit at the base of the tree or at the bottom of the tree diagram.

The observation that certain lineages are near the bottom of a tree does not reflect any aspect of evolutionary history; it is simply a reflection of the choices made in drawing the tree. These choices are generally guided by aesthetic and didactic motivations. That is, the tree is drawn to best communicate the results of the phylogenetic analysis in a visually appealing way. The root could be towards the top or towards the bottom, and the authors can rotate trees at nodes and bend branches. None of these drawing choices alters the relationships depicted in the tree. Thus, the two trees below communicate the same phylogenetic information (e.g., lizards are more closely related to humans than to frogs), despite the fact that the nodes have been rotated. This exercise makes it apparent why you cannot learn to interpret a phylogeny from the order of the tips, only from the order of the nodes. For more practice in reading trees without being distracted by tip order or tree format, look here and here.
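The claim that rotating nodes changes the tip order but not the relationships can be made concrete. Here is a rough sketch, assuming a toy representation in which a tree is a nested tuple and a tip is a string; the helper names `tips`, `clades`, and `tip_order` are hypothetical and not taken from any phylogenetics library.

```python
def tips(node):
    """Return the set of tip names descending from a node."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(tips(child) for child in node))

def clades(node):
    """Return every clade in the tree, each as a frozenset of tip names."""
    if isinstance(node, str):
        return {frozenset([node])}
    found = {tips(node)}
    for child in node:
        found |= clades(child)
    return found

def tip_order(node):
    """Return tip names in the order they would be drawn."""
    if isinstance(node, str):
        return [node]
    return [name for child in node for name in tip_order(child)]

tree_a = ("frog", ("lizard", "human"))
tree_b = (("human", "lizard"), "frog")   # tree_a with both nodes rotated

print(tip_order(tree_a))                  # ['frog', 'lizard', 'human']
print(tip_order(tree_b))                  # ['human', 'lizard', 'frog']
print(clades(tree_a) == clades(tree_b))   # True
```

The two drawings list the tips in opposite orders, yet they contain exactly the same groupings, including the lizard-human clade that excludes the frog.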

You may be wondering, so if the order across the tips can be rotated without changing the tree, how do authors choose among possible rotated versions? Since we read from left to right, it is common to show the 'focal' taxon toward the right and if humans are in the tree, we are almost always put in that prime position! Just take a look at most biology textbook depictions of primate phylogeny, like this one.

Some of these drawing choices may also not be apparent to novice users of phylogenetics software. Most inference software and tree drawing programs will automatically 'ladderize' trees, which places the sparsely sampled outgroups on the left of an upright tree as above or on the top or bottom of a horizontal tree. Thus, the order of the tips that a program produces is a drawing convention and never the outcome of an analysis (e.g., you could never say that 'the maximum likelihood analysis placed lemurs at the base of the primate phylogeny').
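To show what ladderizing does, here is a simplified sketch. It assumes a toy nested-tuple tree and a deliberately small, illustrative primate sample; the `ladderize` function below is a hypothetical re-implementation of the general idea (sort each node's children by how many tips they contain), not the code used by any particular program.

```python
def count_tips(node):
    """Number of tips descending from a node."""
    return 1 if isinstance(node, str) else sum(count_tips(c) for c in node)

def ladderize(node, reverse=False):
    """Reorder the children of every node by tip count (smallest first by default)."""
    if isinstance(node, str):
        return node
    children = [ladderize(child, reverse) for child in node]
    return tuple(sorted(children, key=count_tips, reverse=reverse))

# Illustrative sample only: lemur is the sparsely sampled sister lineage.
tree = ((("human", "chimp"), "gorilla"), "lemur")

print(ladderize(tree))                # ('lemur', ('gorilla', ('human', 'chimp')))
print(ladderize(tree, reverse=True))  # ((('human', 'chimp'), 'gorilla'), 'lemur')
```

Both outputs describe the same tree. Whether the lemur ends up first or last is decided by the `reverse` flag, that is, by a sorting rule, and not by anything the analysis inferred.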

So what should I say, if not basal?

I realize that this whole post may be a big bummer, especially if what you wanted from a phylogeny was to learn which species were ancestral or which diverged first (reminder: neither is possible or realistic, see above and below). Once you have reconciled yourself with the fact that the ancestors are not among us, here are some 100% not-confusing, evolutionarily-consistent, and still interesting things you can say about a tree like the Platanthera orchid phylogeny below from this review paper.

  • The ancestors of extant Platanthera had nocturnally pollinated flowers. [Not, e.g., 'Nocturnally pollinated Platanthera are ancestral.'  Because this could connote that some living Platanthera are ancestors of others.  See space-time-continuum above]
  • Diurnal pollination has arisen multiple times in Platanthera. [Not, e.g., 'Diurnally pollinated Platanthera are evolutionarily derived.' Because character states can be derived but taxa cannot be]
  • The sonoharae-fuscescens-ussuriensis-japonica clade is sister to the rest of the genus Platanthera. [Not, e.g., 'The sonoharae-fuscescens-ussuriensis-japonica clade is basal to the rest of Platanthera.' or 'The sonoharae-fuscescens-ussuriensis-japonica group is an early diverging clade of Platanthera.'  More on early-diverging below]
  • White flowers evolved after the transition to diurnal pollination in the clade that includes P. blephariglottis and P. nivea.
  • The basal nodes of the tree are reconstructed as nocturnally pollinated. [It is fine to describe the earliest nodes in the tree as basal because they did in fact occur earlier than the nodes towards the tips.  Although personally, I prefer 'deeper' nodes.] 

Notice that in this list, I use basal to refer to nodes, sisters to refer to taxa, and ancestral/derived to refer to characters.  My intuition is that many who use basal to refer to taxa are actually most interested in characters (what did the ancestor look like, not who was it exactly).  So if this is the case, talk about the characters! This is a good strategy for avoiding tree mis-speak.

How about calling some taxa 'early-diverging' or 'first-branching' instead of 'basal' or 'lower'?  Doesn't this do the trick?

In short, no. These terms are equally uninformative, and since many listeners will equate them with basal, equally misleading.  I'll walk through an example that may help to reveal why early-diverging is not meaningful.

Let's first consider a very simple tree with just two tips.

We would look at this tree and say there are two tips that have diverged from a common ancestor and have evolved for the same period of time since that split.  We can add another taxon to this tree.

It is still true in this tree that the turtle and the human have evolved the same amount of time since the earliest node in the tree. And so has the bird, remembering that its ancestry includes the branch shared with the turtle. Looking at this sort of topology, there is a tendency to call the branch labeled human 'early diverging', but of course the bird-turtle branch diverged at the same moment. Thus, this term seems instead to be applied to whichever branch has given rise to fewer descendants given the taxon sampling. Taxon sampling, like tree drawing, is a choice, and I could instead have chosen a different set of three taxa, e.g.,

Calling the turtle an early-diverging amniote based on this tree is just as odd as calling the human an early-diverging amniote based on the previous tree. Even if we had all of the amniotes sampled in this tree, one of the two branches arising from the root node would be less speciose than the other. Tree drawing software will typically ladderize the tree and thus would place the less speciose clade at the bottom of the figure. This placement does not make it earlier-diverging -- just as with any pair of sister taxa, the two descendant branches emerged from the node at the same time and have evolved for the same time to reach the present. This is not affected by how many times those branches have split.
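The dependence of the 'early-diverging' label on taxon sampling can be sketched in code. This is only an illustration under stated assumptions: the tree is a toy nested tuple, "mouse" is a hypothetical stand-in for a second sampled mammal (the post does not name one), and `prune` and `species_poor_side` are made-up helper names.

```python
def prune(node, keep):
    """Restrict the tree to the tips in `keep`, collapsing nodes left with one child."""
    if isinstance(node, str):
        return node if node in keep else None
    kept = [p for p in (prune(child, keep) for child in node) if p is not None]
    if not kept:
        return None
    return kept[0] if len(kept) == 1 else tuple(kept)

def tip_names(node):
    """Return the tip names descending from a node, in drawing order."""
    if isinstance(node, str):
        return [node]
    return [name for child in node for name in tip_names(child)]

def species_poor_side(tree):
    """Tips on whichever root branch has fewer sampled descendants."""
    return sorted(tip_names(min(tree, key=lambda child: len(tip_names(child)))))

# Four amniotes; "mouse" is an assumed extra mammal added for illustration.
full_tree = (("human", "mouse"), ("turtle", "bird"))

sample_1 = prune(full_tree, {"human", "turtle", "bird"})
sample_2 = prune(full_tree, {"human", "mouse", "turtle"})

print(sample_1, species_poor_side(sample_1))  # ('human', ('turtle', 'bird')) ['human']
print(sample_2, species_poor_side(sample_2))  # (('human', 'mouse'), 'turtle') ['turtle']
```

The root split is the same event in both samples. Only the choice of which tips to include changes which side looks species-poor, so the label tracks sampling and not the timing of divergence.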

I have also seen a tendency for some to say, well, what I mean by early-diverging or basal is indeed the species-poor sister group, so as long as that's clear to the audience, I will continue to use this term. My first response is that pointing out that one group is species-poor is not an interesting observation by itself. All real trees are unbalanced and one of the two branches descending from the root will almost always have more descendants. Second, I think this choice of wording is simply too dangerous. Terms like basal and early-diverging carry strong connotations, and most audiences will assume taxa carrying this label have retained more primitive characters or will fall into one of the many misconceptions listed above.

Looking forward

So what do I hope will result from this long exposition on tree terminology? Most of all, to have convinced you that it's not about the terms, it's about the ideas. Evolutionary biologists, me included, spend tremendous energy to learn about the history of life -- when groups diverged from each other, what changes occurred along the different branches, what factors may have caused these changes. But this effort is wasted if sloppy terminology allows the inferred history to be misconstrued as a ladder of progress, or yet another living fossil. I believe that we don't need such familiar and comfortable storytelling to make evolution interesting or relevant, to our peers or to the general public. The truth is that all living taxa have traversed fascinating paths to reach the present and all of their stories are worth telling.

**I'd like to acknowledge advice from Emily Sessa, Brian O'Meara, and Eric Schranz on this post, as well as helpful comments from Matt Hahn.