Do statistics prove common ancestry?

Photo credit: Anna Dudkova via Unsplash.

Over the past year, various Discovery Institute staff, fellows, and summer seminar alumni have participated in a book club reading articles related to systematics and phylogenetics. One of the articles we read was Baum et al. (2016), “Statistical evidence of common ancestry: application to primates.” The idea behind the paper originated in 2014 when David A. Baum, Cécile Ané, and Bret Larget taught a postgraduate seminar at the University of Wisconsin, Madison, where they are all faculty members. Given the importance of common ancestry to the theory of evolution, their motivation was to take the evidence supporting common ancestry and convert it into quantified statistics. They focused on primates because they believe this is a central point in the evolution debate.

A number of scientists, including myself, analyzed their methods and had a group discussion regarding their conclusion: “The common ancestry of primates is an extremely well supported hypothesis.”

General thoughts

Our group selected this article because the proposal – to test two alternative models against common ancestry – is both rare in the literature and fascinating. The authors studied two versions of separate ancestry (SA), which they defined as “species SA (the separate origin of each named species) and family SA (the separate origin of each family).” We didn’t agree with the whole paper, but appreciated that it was easy to follow, insightful, and respectful. Some members of the group also appreciated that both morphological and molecular characters were taken into account. Although not everyone agreed with the conclusion, most thought the paper’s statistical methodology for “quantifying” historical science was useful.

Also, before I dig in, I want to note that intelligent design (ID) is not necessarily incompatible with common ancestry. In fact, our book club included ID proponents with a variety of views on common ancestry – some supportive, some skeptical, and some agnostic.

A separate ancestry model does not represent the ‘other side’ point of view

The biggest problem we saw with the article was that the alternative models to common ancestry – species SA and family SA – are not accurate representations of “the other side of the debate,” that is, the proponents of intelligent design who question common ancestry. Here’s why.

In essence, their comparisons asked whether the similarities between organisms that form the basis of phylogenetic comparisons might have arisen by chance or from common ancestry. If common ancestry was a more likely explanation than chance, they concluded that common ancestry was supported. But no one is suggesting that chance would produce the similarities. For the ID proponent who questions common ancestry, the similarities would be produced by design. Even the authors noted that such a test is biased in favor of common ancestry:

Most of the statistical tests we are discussing are epistemologically asymmetric. They involve identifying a pattern that is expected under CA [common ancestry] and then quantifying the probability that the observed data could have occurred by chance under SA [separate ancestry].
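To make the asymmetry concrete, here is a minimal sketch of this kind of test. It is not the authors' actual procedure. The statistic `congruence`, the shuffling null, and the toy matrix of six hypothetical species are all assumptions made for illustration. The null model mimics the species SA idea that each species draws its state for each trait independently, by shuffling every trait across species on its own.

```python
import random
from itertools import combinations


def congruence(matrix):
    """Sum over species pairs of (number of matching traits) squared.

    Shuffling a trait across species keeps the total number of matches fixed,
    so a large value means the matches are concentrated in the same pairs.
    """
    total = 0
    for a, b in combinations(matrix, 2):
        matches = sum(x == y for x, y in zip(a, b))
        total += matches * matches
    return total


def shuffle_traits(matrix, rng):
    """Shuffle each trait (column) across species independently of the others."""
    columns = [list(col) for col in zip(*matrix)]
    for col in columns:
        rng.shuffle(col)
    return [list(row) for row in zip(*columns)]


def permutation_p_value(matrix, n_perm=2000, seed=0):
    """Share of shuffled data sets at least as congruent as the observed one."""
    rng = random.Random(seed)
    observed = congruence(matrix)
    hits = sum(
        congruence(shuffle_traits(matrix, rng)) >= observed
        for _ in range(n_perm)
    )
    return (hits + 1) / (n_perm + 1)


# Hypothetical data: 6 species (rows) x 8 binary traits (columns).
toy = [
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 1],
    [1, 1, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1, 1, 0],
    [0, 0, 0, 1, 1, 1, 1, 1],
]

p = permutation_p_value(toy)
print(f"p-value under the independent-draw null: {p:.4f}")
```

A small p-value here rejects only the independent-draw null. By itself, the test says nothing about what produced the shared pattern, which is the asymmetry the authors describe.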

What came out of our discussion was that this model of separate ancestry is not endorsed by anyone in the ID community and, frankly speaking, is extremely unrealistic for all kinds of biological reasons. For example, this separate ancestry model would have come as a big surprise to Carl Linnaeus, Georges Cuvier, or Louis Agassiz, who organized taxonomic groups around shared similarities without requiring evolutionary descent as their cause.

Their conclusion – “We overwhelmingly rejected both species and family SA with infinitesimal P values” – is not surprising and amounts to a non-test of any actual SA model. Accordingly, this conclusion presents no challenge to ID proponents who question common ancestry, and the discussion could best be summarized as “talking past each other.” One of the main takeaways for ID proponents is that they need to clarify what their separate ancestry model really is, so that others can test it. I’m going to try to do a little of that, conceptually, now.

What can similarity tell us about history? Hint: not much

The authors assume that similarities between primates imply historical relationships: “[O]rganisms [that] share similarities, especially similarities that would be very unlikely to occur independently, provide… evidence in favor of CA (Sober and Steel 2015).” This assumption is standard in historical phylogenetics and led the authors to design the species SA model with all similarities arising independently: “A key feature of the species SA model is that for each trait, the state drawn by each species is independent of that drawn by other species.”

This model and assumption are very problematic from an ID perspective. Design can cause striking similarities without any historical or evolutionary relationship.

Consider a scenario where there are three German Shepherds: a mother, her son, and a third who is a genetically modified clone of the son, born in a laboratory womb. The genetically engineered German Shepherd in this imaginary scenario is genetically identical to the real son and phenotypically similar, but has no historical relationship to the mother. Instead, it is a product of human genetic engineering.

I describe this scenario because if there are mechanisms beyond historical relationships that could explain genetic similarity, such as genetic engineering, then it is no longer possible to assume that similarity must imply a historical relationship. Although in this case the existence of the mother is necessary for the clone, it is not sufficient to explain the existence of the clone or its similarities. It would be incorrect to describe the third German Shepherd as the historical descendant of the mother. Much the same holds for Craig Venter’s Syn3.0 cell, which is based on a Mycoplasma strain but would not exist without careful design by molecular biologists and human geneticists.

Thus, the assumption that ancestry is the only mechanism or the best explanation for the similarity of traits is not held by the ID proponent. Instead, ID proponents argue that a designer can produce similarities, just as different Gucci handbags have similarities. A more technical explanation of an ID model of separate ancestry can be found here.

Cherry picking?

Before doing any statistics, the authors selected the characters (genes and morphological features) that they would use for their analysis. How did they choose? For the molecular data set, this is not clearly stated in their article. But reading elsewhere (Perelman et al. 2011; Murphy et al. 2001), we found that primers from earlier studies were used alongside new ones. On checking the cited sources, we found the following.

We examined sequence variation in 18 homologous gene segments (including nearly 10,000 base pairs) that were selected for maximal phylogenetic informativeness in resolving the hierarchy of early mammalian divergence.

Perelman et al. 2011; Murphy et al. 2001

Why are these sequences phylogenetically informative? Because they resolve the hierarchy of early mammalian divergence in the way the model anticipates. Several participants raised issues with this approach during the discussion. They argued that selecting or excluding data based on whether it resolves to an expected pattern stacks the deck. One participant said that if you’re going to use genetic similarity to justify a historical relationship, then you have to take into account all of the DNA (which poses other problems) and apply a penalty for sequences that do not fit the expected pattern. In other words, orphan genes should count against the hypothesis of a historical relationship.
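The participants' general worry about selection can be illustrated with a small simulation. This is a sketch under stated assumptions and does not reproduce how Murphy et al. or Perelman et al. actually chose their loci. The pool of 500 random binary characters, the hypothetical `expected_grouping` of eight species, and the `fit_to_grouping` score are all invented for illustration. The 18 simply echoes the number of gene segments in the quote above.

```python
import random


def fit_to_grouping(character, grouping):
    """Fraction of species whose state lines up with the expected split.

    The labels 0/1 are arbitrary, so the complement of the split counts too.
    """
    agree = sum(s == g for s, g in zip(character, grouping))
    return max(agree, len(grouping) - agree) / len(grouping)


def mean_fit(characters, grouping):
    """Average fit of a set of characters to the expected split."""
    return sum(fit_to_grouping(c, grouping) for c in characters) / len(characters)


rng = random.Random(1)

# Hypothetical expected split of 8 species into two groups.
expected_grouping = [0, 0, 0, 0, 1, 1, 1, 1]

# 500 characters drawn purely at random: they carry no real signal.
pool = [[rng.randint(0, 1) for _ in expected_grouping] for _ in range(500)]

# Keep only the 18 characters that best resolve the expected split.
selected = sorted(
    pool,
    key=lambda c: fit_to_grouping(c, expected_grouping),
    reverse=True,
)[:18]

pool_fit = mean_fit(pool, expected_grouping)
selected_fit = mean_fit(selected, expected_grouping)
print(f"mean fit, all characters:      {pool_fit:.2f}")
print(f"mean fit, selected characters: {selected_fit:.2f}")
```

Even though every character is random noise, the selected subset fits the expected split far better than the full pool. That is why the participants wanted the characters that fail to fit counted as well.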


In summary, the methodology and the desire to test alternatives to common ancestry shown in this paper are admirable. However, the paper makes a number of common assumptions that proponents of separate ancestry from a design perspective would not accept. Namely, it does not recognize that design may generate similarities independent of common ancestry. This makes the majority of the paper a case of “talking past each other” while avoiding the real issue of whether a design-based model is actually a better fit for the data. What should happen next? More dialogue between these two camps would be helpful so that appropriate models can be created and then statistical tests applied.


  • Baum, David A., Cécile Ané, Bret Larget, Claudia Solís-Lemus, Lam Si Tung Ho, Peggy Boone, Chloe P. Drummond, Martin Bontrager, Steven J. Hunter and William Saucier. 2016. “Statistical evidence of common ancestry: application to primates”. Evolution.
  • Murphy, WJ, E. Eizirik, WE Johnson, YP Zhang, OA Ryder, and SJ O’Brien. 2001. “Molecular Phylogenetics and the Origins of Placental Mammals.” Nature 409 (6820): 614-18.
  • Perelman, Polina, Warren E. Johnson, Christian Roos, Hector N. Seuánez, Julie E. Horvath, Miguel AM Moreira, Bailey Kessing, et al. 2011. “A Molecular Phylogeny of Living Primates.” PLoS Genetics 7 (3): e1001342.
