Abstract
Molecular dating is a cornerstone of evolutionary biology, yet it is far from an exact science. Inferring precise dates from gene sequences is difficult, in part because of the stochastic process of DNA mutation, selective forces that alter substitution rates, and many unknown parameters linked to population genetics in ancestral lineages. Dating species divergence is one important challenge in this field; it is usually performed by concatenating extant sequences sampled within a genome as representative of a lineage and computing distances between these lineages. However, concatenation precludes the dating of events specific to a gene family, such as gene duplications. Over evolutionary time, individual gene sequences record different signatures of base substitution, at rates that may deviate substantially from the average rate. No formal study exists that quantifies which parameters influence this deviation. Here we designed a strategy to date events within a gene family, and we tested the influence of more than 30 parameters on dating accuracy. We developed this approach on approximately 5,000 primate gene families spanning 12 genomes that display neither gene loss nor gene duplication. We then tested its relevance for dating gene duplications in the complete set of primate gene families. Our results are compared to previous fossil and molecular dating approaches, and provide a practical set of guidelines for accurate molecular dating at the single gene family level.
Publisher
Cold Spring Harbor Laboratory