Terms like 'basal', 'early-diverging', and 'first-branching'
reflect persistent misconceptions about evolution and phylogenies
Why take the time to
blog about the issues with the use of the word "basal" and similar
terms?
This is a good question because, indeed, many tree-thinking
papers have directly discussed misconceptions related to interpreting
phylogenies (e.g., Omland
et al. and Meir
et al.). In fact, several have tackled
issues surrounding the term "basal" specifically (see Krell
and Cranston and Crisp
and Cook). Given these efforts, I hoped that the use of this term (and the
associated misconceptions) would begin to erode. Unfortunately I feel as though, if anything,
the problem is becoming more widespread.
I think this is largely for several good reasons -- building phylogenies
is continually becoming easier, even for large datasets, and many researchers
from a range of fields are seeking to incorporate an evolutionary perspective
into their research. However, it's
important to point out that the misinterpretation of phylogenies is just as
common in evolutionary biology as in any other field. So in short, the “basal” problem is not going
to disappear without active efforts to teach tree-thinking to all biologists at
all stages.
So what is the
problem with 'basal', exactly?
The problem is that the term is used incorrectly and/or in
misleading ways in talks, papers, and proposals, roughly 90% of the time (by my
estimate). Moreover, the use of basal
and similar terms perpetuates a large suite of misconceptions about how
evolution works. So in order to
communicate effectively and accurately about evolution, we must also
communicate effectively and accurately about trees. As I have struggled to understand the desire
to describe some taxa as basal (or early-diverging or early-branching), I've
assembled a mental list of the various ideas that speakers and writers seem to
be aiming to communicate with the use of these terms. I've listed these below along with comments
about relevant misconceptions. This list
overlaps with what has been described in the publications listed above.
1) A basal species is
one that has given rise to another species, i.e. some ancestral lineage.
I think this use of the term is related to the misconception
that some living taxa are the ancestors of other living taxa. Unless something has changed with respect to
the space-time continuum, this is not possible.
The ancestors are lineages that are no longer present -- they are
represented in the tree as internal nodes and internal branches.
2) A basal taxon is one
that is older than other taxa in the tree.
If all of the taxa at the tips are extant (i.e., not
extinct), then all of them are the same age.
They all have the same root-to-tip distance in terms of time. In other
words, they have all evolved the same amount of time from the base of the tree.
(Note that this is a correct use of base -- the base is the earliest part of
the tree, the root, and time proceeds forward from that point.) It is worth noting that in molecular
phylogenies, where branch lengths measure substitutions rather than time, some root-to-tip paths may be longer or shorter (i.e., the tree may not be ultrametric). This is due to a combination of stochasticity
in the substitution process as well as differences in substitution rates across
lineages. However, we would not say that the taxa on longer branches are
"more evolved" than the other taxa on the tree.
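To make the "same age" point concrete, here is a minimal sketch that sums branch lengths from the root to every tip of a toy time-calibrated tree. The tree, the node names, and the branch lengths (in arbitrary time units) are made up for illustration and do not come from any real analysis.

```python
# A toy time-calibrated tree with hypothetical branch lengths (arbitrary time units).
# Each node is (name, branch_length_from_parent, children).
tree = ("root", 0.0, [
    ("frog", 10.0, []),
    ("amniote_ancestor", 3.0, [
        ("lizard", 7.0, []),
        ("human", 7.0, []),
    ]),
])

def root_to_tip(node, elapsed=0.0):
    """Return {tip_name: total time elapsed since the root} for every tip below node."""
    name, length, children = node
    elapsed += length
    if not children:          # a tip: report the accumulated time
        return {name: elapsed}
    distances = {}
    for child in children:    # an internal node: keep walking toward the tips
        distances.update(root_to_tip(child, elapsed))
    return distances

distances = root_to_tip(tree)
print(distances)  # {'frog': 10.0, 'lizard': 10.0, 'human': 10.0}
```

The frog sits alone on one side of the root, yet its root-to-tip time is identical to that of the lizard and the human. Only the number of splits along the way differs.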
3) A basal taxon is
primitive morphologically or in some other sense.
All species, extant or extinct, possess a mixture of characteristics
that are, in cladistic terminology, ancestral (plesiomorphic) or derived (apomorphic)
relative to other species. For example,
in reptiles, having scales and four limbs is the ancestral state. Snakes have
retained the ancestral state of having scales but have the derived state of no
limbs. This highlights the fact that no
species can be accurately described in evolutionary terms as 'primitive',
'ancestral', 'lower', or 'basal' any more than they can be described as
'derived', 'advanced', or 'higher'. In this sense, the tree-thinking view of
species diversity is rather egalitarian.
All of the species on earth have evolved the same amount of time from
the last
common ancestor of all life some 3 to 4 billion years ago, and their
diverse forms reflect the accumulation of changes during their unique pathways
along the tree to the present. It's worth noting that this misconception (that
not all species are equally 'evolved', or equally 'advanced') has been linked
to the history of progressive ideas in evolution, and specifically, the notion
that humans
sit at the apex of a ladder of life. This is exactly why terms like basal
are more than just poor wording; they perpetuate the incorrect interpretation
of phylogenies as ladders of progress. We can't expect to improve
understanding of the tree-like nature of evolution while continuing to use
misleading terminology.
4) Basal lineages sit
at the base of the tree or at the bottom of the tree diagram.
The observation that certain lineages are near the bottom of
a tree does not reflect any aspect of evolutionary history; it is simply a
reflection of the choices made in drawing the tree. These choices are generally
guided by aesthetic and didactic motivations.
That is, the tree is drawn to best communicate the results of the
phylogenetic analysis in a visually appealing way. The root could be towards
the top or towards the bottom, and the authors can rotate trees at nodes and
bend branches. None of these drawing
choices alters the relationships depicted in the tree. Thus, the two trees below communicate the
same phylogenetic information (e.g., lizards are more closely related to humans
than to frogs), despite the fact that the nodes have been rotated. This exercise makes it apparent why you
cannot learn to interpret a phylogeny from the order of the tips, only from the
order of the nodes. For more practice in
reading trees without being distracted by tip order or tree format, look here
and here.
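One way to convince yourself that rotation changes nothing is to compare trees by the clades they contain rather than by the order of their tips. The sketch below does this for nested-tuple trees. The three small trees are hypothetical stand-ins for the kind of figure discussed above, and the helper names are my own.

```python
def clades(tree):
    """Return the set of clades (each a frozenset of tip names) in a nested-tuple tree."""
    found = set()

    def tips_below(node):
        if isinstance(node, str):                     # a tip
            return frozenset([node])
        tips = frozenset().union(*(tips_below(child) for child in node))
        found.add(tips)                               # record the clade at this node
        return tips

    tips_below(tree)
    return found

def tip_order(node):
    """Return the tips in the order they would be drawn."""
    if isinstance(node, str):
        return [node]
    return [tip for child in node for tip in tip_order(child)]

tree_a = ("frog", ("lizard", "human"))
tree_b = (("human", "lizard"), "frog")   # tree_a with both nodes rotated
tree_c = (("frog", "lizard"), "human")   # a genuinely different set of relationships

print(tip_order(tree_a), tip_order(tree_b))   # the tip orders differ
print(clades(tree_a) == clades(tree_b))       # True: same relationships
print(clades(tree_a) == clades(tree_c))       # False: different relationships
```

The tip order is reversed between the first two trees, but the clades, and therefore the relationships, are identical.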
You may be wondering, so if the order across the tips can be
rotated without changing the tree, how do authors choose among possible rotated
versions? Since we read from left to right, it is common to show the 'focal'
taxon toward the right and if humans are in the tree, we are almost always put
in that prime position! Just take a look at most biology textbook depictions of
primate phylogeny, like this one.
Some of these drawing choices may also not be apparent to
novice users of phylogenetics software. Most inference software and tree
drawing programs will automatically 'ladderize'
trees, which places the sparsely sampled outgroups on the left of an upright
tree as above or on the top or bottom of a horizontal tree. Thus, the order of the tips that a program
produces is a drawing convention and never the outcome of an analysis (e.g., you
could never say that 'the maximum likelihood analysis placed lemurs at the base
of the primate phylogeny').
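Ladderizing is simple enough to sketch in a few lines: at every node, sort the child clades by how many tips they contain. The version below is a simplified illustration on nested tuples, not the implementation used by any particular program, and the four-taxon primate tree is only a toy example.

```python
def count_tips(node):
    """Number of tips below a node in a nested-tuple tree."""
    return 1 if isinstance(node, str) else sum(count_tips(child) for child in node)

def ladderize(node):
    """Return a copy in which the children of every node are sorted, smallest clade first."""
    if isinstance(node, str):
        return node
    return tuple(sorted((ladderize(child) for child in node), key=count_tips))

def clades(node, found=None):
    """Collect every clade (as a frozenset of tip names) in the tree."""
    found = set() if found is None else found
    if isinstance(node, str):
        return frozenset([node]), found
    tips = frozenset().union(*(clades(child, found)[0] for child in node))
    found.add(tips)
    return tips, found

drawn = (("macaque", ("chimp", "human")), "lemur")
ladderized = ladderize(drawn)

print(ladderized)   # ('lemur', ('macaque', ('chimp', 'human')))
print(clades(drawn)[1] == clades(ladderized)[1])   # True: only the drawing changed
```

The lemur ends up at the edge of the figure purely because its side of the root has fewer sampled tips. Sorting largest-first instead would put it at the opposite edge with no change to the clades.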
So what should I say,
if not basal?
I realize that this whole post may be a big bummer. Especially if what you wanted from a
phylogeny was to learn which species were ancestral or which diverged first
(reminder: neither is possible or realistic; see above and below). Once you
have reconciled yourself with the fact that the ancestors are not among us,
here are some 100% not-confusing, evolutionarily-consistent, and still
interesting things you can say about a tree like the Platanthera orchid phylogeny below from this review
paper.
- The ancestors of extant Platanthera had nocturnally pollinated flowers. [Not, e.g., 'Nocturnally pollinated Platanthera are ancestral.' Because this could connote that some living Platanthera are ancestors of others. See space-time-continuum above]
- Diurnal pollination has arisen multiple times in Platanthera. [Not, e.g., 'Diurnally pollinated Platanthera are evolutionarily derived.' Because character states can be derived but taxa cannot be]
- The sonoharae-fuscescens-ussuriensis-japonica clade is sister to the rest of the genus Platanthera. [Not, e.g., 'The sonoharae-fuscescens-ussuriensis-japonica clade is basal to the rest of Platanthera.' or 'The sonoharae-fuscescens-ussuriensis-japonica group is an early diverging clade of Platanthera.' More on early-diverging below]
- White flowers evolved after the transition to diurnal pollination in the clade that includes P. blephariglottis and P. nivea.
- The basal nodes of the tree are reconstructed as nocturnally pollinated. [It is fine to describe the earliest nodes in the tree as basal because they did in fact occur earlier than the nodes towards the tips. Although personally, I prefer 'deeper' nodes.]
Notice that in this list, I use basal to refer to nodes,
sisters to refer to taxa, and ancestral/derived to refer to characters. My intuition is that many who use basal to
refer to taxa are actually most interested in characters (what did the ancestor
look like, not who was it exactly). So
if this is the case, talk about the characters! This is a good strategy for
avoiding tree mis-speak.
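Statements about characters at nodes come from ancestral state reconstruction. As a rough illustration, here is a minimal sketch of the first pass of Fitch parsimony on a binary tree, which returns the most parsimonious state(s) at the root. The tips A to E and their pollination states are entirely hypothetical and are not taken from the Platanthera phylogeny.

```python
def fitch_root_states(node, tip_states):
    """First (tips-to-root) pass of Fitch parsimony on a binary nested-tuple tree.

    Returns the set of most parsimonious states at the given node.
    """
    if isinstance(node, str):                 # a tip: its observed state
        return {tip_states[node]}
    left, right = (fitch_root_states(child, tip_states) for child in node)
    # Keep the states shared by both children; if none are shared, keep all of them.
    return (left & right) or (left | right)

# Hypothetical tree and character states, for illustration only.
tree = ((("A", "B"), "C"), ("D", "E"))
pollination = {
    "A": "diurnal",
    "B": "nocturnal",
    "C": "nocturnal",
    "D": "diurnal",
    "E": "nocturnal",
}

print(fitch_root_states(tree, pollination))   # {'nocturnal'}
```

In this toy case the root is reconstructed as nocturnal and diurnal pollination arises separately in A and D. Both are statements about nodes and character states; neither labels a living taxon as ancestral or derived.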
How about calling
some taxa 'early-diverging' or 'first-branching' instead of 'basal' or
'lower'? Doesn't this do the trick?
In short, no. These terms are equally uninformative, and
since many listeners will equate them with basal, equally misleading. I'll walk through an example that may help to
reveal why early-diverging is not meaningful.
Let's first consider a very simple tree with just two tips.
We would look at this tree and say there are two tips that
have diverged from a common ancestor and have evolved for the same period of
time since that split. We can add
another taxon to this tree.
It is still true in this tree that the turtle and the human
have evolved the same amount of time since the earliest node in the tree. And so has the bird, remembering that its ancestry
includes the branch shared with the turtle.
Looking at this sort of topology, there is a tendency to call the branch
labeled human as 'early diverging', but of course the bird-turtle branch
diverged at the same moment. Thus, this
term seems instead to be applied to whichever branch has given rise to fewer
descendants given the taxon sampling. Taxon sampling, like tree drawing, is a
choice, and I could instead have chosen a different set of three taxa, e.g.,
Calling the turtle an early-diverging amniote based on this
tree is just as odd as calling the human an early-diverging amniote based on
the previous tree. Even if we had all of
the amniotes sampled in this tree, one of the two branches arising from the
root node would be less speciose than the other. Tree drawing software will typically
ladderize the tree and thus would place the less speciose clade at the bottom
of the figure. This placement does not
make it earlier-diverging -- just as with any pair of sister taxa, the two
descendent branches emerged from the node at the same time and have evolved for
the same time to reach the present. This is not affected by how many times
those branches have split.
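The dependence on taxon sampling can be sketched by pruning one larger tree down to different sets of three tips and asking which side of the root looks "lonely" each time. The four-taxon tree below, including the choice of a mouse as the second mammal, is an assumption made for illustration and is not the tree from the original figures.

```python
def count_tips(node):
    """Number of tips below a node in a nested-tuple tree."""
    return 1 if isinstance(node, str) else sum(count_tips(child) for child in node)

def prune(node, keep):
    """Return the tree induced by the tips in keep, collapsing single-child nodes."""
    if isinstance(node, str):
        return node if node in keep else None
    kept = [p for p in (prune(child, keep) for child in node) if p is not None]
    if not kept:
        return None
    return kept[0] if len(kept) == 1 else tuple(kept)

def species_poor_side(tree):
    """Return whichever child of the root has fewer sampled tips."""
    return min(tree, key=count_tips)

# Hypothetical four-taxon amniote tree: two mammals and two reptiles.
full = (("human", "mouse"), ("turtle", "bird"))

sample_1 = prune(full, {"human", "turtle", "bird"})
sample_2 = prune(full, {"human", "mouse", "turtle"})

print(sample_1, "->", species_poor_side(sample_1))   # human is the lone branch
print(sample_2, "->", species_poor_side(sample_2))   # turtle is the lone branch
```

The root split is the same event in both samples. Which lineage ends up looking "early-diverging" depends only on which tips were included.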
I have also seen a tendency for some to say, well, what I
mean by early-diverging or basal is indeed the species-poor sister group, so as
long as that's clear to the audience, I will continue to use this term. My first response is that pointing out that
one group is species-poor is not an interesting observation by itself. All real trees are unbalanced and one of the
two branches descending from the root will almost always have more descendants.
Second, I think this choice of wording is simply too dangerous. Terms like basal and early-diverging
carry strong connotations, and most audiences will assume taxa carrying this
label have retained more primitive characters or will fall into one of the many
misconceptions listed above.
Looking forward
So what do I hope will result from this long exposition on tree terminology? Most of all, I hope to have convinced you that it's not about the terms, it's about the ideas. Evolutionary biologists, me included, spend tremendous energy to learn about the history of life -- when groups diverged from each other, what changes occurred along the different branches, what factors may have caused these changes. But this effort is wasted if sloppy terminology allows the inferred history to be misconstrued as a ladder of progress, or yet another living fossil. I believe that we don't need such familiar and comfortable storytelling to make evolution interesting or relevant, to our peers or to the general public. The truth is that all living taxa have traversed fascinating paths to reach the present and all of their stories are worth telling.
**I'd like to acknowledge advice from Emily Sessa, Brian O'Meara, and Eric Schranz on this post, as well as helpful comments from Matt Hahn.