Abstract
In the past decade, ancient protein sequences have emerged as a valuable source of data for deep-time phylogenetic inference. Still, the recovery of protein sequences providing novel phylogenetic insights does not exceed 3.7 Ma (Pliocene). Here, we push this boundary back to 21-24 Ma (early Miocene) by retrieving enamel protein sequences of an early-diverging rhinocerotid (Epiaceratherium sp., CMNF-59632) from the Canadian High Arctic. We recover partial sequences of seven enamel proteins (AHSG, ALB, AMBN, AMELX, AMTN, ENAM, MMP20) and over 1000 peptide-spectrum matches, spanning at least 251 amino acids. Authentic endogeneity of these sequences is supported by indicators of protein damage, including several spontaneous and irreversible post-translational modifications accumulated during prolonged diagenesis and reaching near-complete occupancy at many sites. Bayesian tip-dating across 15 extant and extinct perissodactyl taxa places the divergence time of CMNF-59632 in the middle Eocene-Oligocene, and identifies a later divergence time for Elasmotheriinae in the Oligocene. The finding weakens alternative models suggesting a deep basal split between Elasmotheriinae and Rhinocerotinae. This divergence time of CMNF-59632 coincides with a phase of high diversification of rhinocerotids, and supports a Eurasian origin of this clade in the late Eocene or Oligocene. The findings are consistent with previous hypotheses on the origin of the enigmatic fauna of the Haughton crater, which, in spite of its considerable degree of endemism, also displays similarity to distant Eurasian faunas. Our findings demonstrate the potential of palaeoproteomics in obtaining phylogenetic information from a specimen that is ten times older than any sample from which endogenous DNA has been obtained.
Publisher
Cold Spring Harbor Laboratory