Abstract
Ancestral reconstruction is a widely used technique that has been applied to understand the history of gain and loss of gene families over evolutionary time scales, and to produce hypotheses on how these gains and losses may have influenced the evolutionary trajectories of extant organisms.

Ancestral gene content can be reconstructed via different phylogenetic methods, such as maximum likelihood or Bayesian inference, but many current and previous studies employ Dollo parsimony. We hypothesize that Dollo parsimony is not appropriate for inferences of ancestral gene content based on sequence homology, because it is derived from the assumption that a complex character can be gained only once in evolutionary history. This premise does not accurately model molecular sequence evolution, in which false orthology can result from sequence convergence (including random sequence similarity and parallel gene gains), non-orthologous homology or lateral gene transfer. The aim of this study is to test Dollo parsimony’s suitability for ancestral gene content reconstruction and to compare its inferences with those of a maximum likelihood-based approach that allows a gene family to be gained more than once within a tree.

To test our hypothesis, we first compared the performance of the two approaches on a series of artificial datasets, each comprising 5,000 genes, that were simulated according to a spectrum of evolutionary rates. Next, we reconstructed protein domain evolution on a phylogeny representing known eukaryotic diversity. We observed that Dollo parsimony produced numerous overestimations of ancestral gene content, which were more pronounced at nodes closer to the root of the tree. These observations confirm our hypothesis and lead us to conclude that Dollo parsimony is not an appropriate method for ancestral reconstruction studies based on sequence homology.
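The single-gain assumption can be illustrated with a minimal Python sketch. This is not the implementation used in the study: the four-leaf tree, the node names and the function `dollo_present_nodes` are hypothetical and chosen only for illustration. Under Dollo parsimony, the one permitted gain is placed at the most recent common ancestor of all leaves in which the gene is detected, and the gene is inferred to be present at every node on the paths from that ancestor to those leaves.

```python
# Hypothetical toy tree, stored as child -> parent (leaves: A, B, C, D).
PARENT = {
    "A": "N1", "B": "N1",
    "C": "N2", "D": "N2",
    "N1": "root", "N2": "root",
}


def path_to_root(node):
    """Return the nodes from `node` up to the root, inclusive."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def dollo_present_nodes(leaves_with_gene):
    """Nodes inferred to carry the gene when only one gain is allowed."""
    if not leaves_with_gene:
        return set()
    paths = [path_to_root(leaf) for leaf in leaves_with_gene]
    # Ancestors shared by every leaf in which the gene was detected.
    common = set(paths[0]).intersection(*paths[1:])
    # The single gain sits at the most recent of those shared ancestors.
    mrca = next(node for node in paths[0] if node in common)
    present = set()
    for path in paths:
        # Keep each leaf-to-MRCA path; losses explain every other absence.
        present.update(path[: path.index(mrca) + 1])
    return present


# Gene detected only in the sister leaves A and B: gain placed at N1.
true_case = dollo_present_nodes({"A", "B"})

# One additional spurious homology hit in the distant leaf D.
false_hit_case = dollo_present_nodes({"A", "B", "D"})
```

In this toy case, a single spurious detection in leaf D moves the inferred gain from N1 to the root and adds the gene to N2 as well, which is consistent with the pattern of overestimation near the root described above. A model that permits more than one gain could instead explain the hit in D as an independent event.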
Publisher
Cold Spring Harbor Laboratory