Affiliation:
1. Department of Microbiology and Cell Science, Institute of Food and Agricultural Sciences, University of Florida
Abstract
Ancestral sequence reconstruction (ASR) uses an alignment of extant protein sequences, a phylogeny describing the history of the protein family and a model of the molecular-evolutionary process to infer the sequences of ancient proteins, allowing researchers to directly investigate the impact of sequence evolution on protein structure and function. Like all statistical inferences, ASR can be sensitive to violations of its underlying assumptions. Previous studies have shown that, whereas phylogenetic uncertainty has only a very weak impact on ASR accuracy, uncertainty in the protein sequence alignment can more strongly affect inferred ancestral sequences. Here, we show that errors in sequence alignment can produce errors in ASR across a range of realistic and simplified evolutionary scenarios. Importantly, sequence reconstruction errors can lead to errors in estimates of structural and functional properties of ancestral proteins, potentially undermining the reliability of analyses relying on ASR. We introduce an alignment-integrated ASR approach that combines information from many different sequence alignments. We show that integrating alignment uncertainty improves ASR accuracy and the accuracy of downstream structural and functional inferences, often performing as well as highly accurate structure-guided alignment. Given the growing evidence that sequence alignment errors can impact the reliability of ASR studies, we recommend that future studies incorporate approaches to mitigate the impact of alignment uncertainty. Probabilistic modeling of insertion and deletion events has the potential to radically improve ASR accuracy when the model reflects the true underlying evolutionary history, but further studies are required to thoroughly evaluate the reliability of these approaches under realistic conditions.
Funder
National Science Foundation
Publisher
Oxford University Press (OUP)
Subject
Genetics; Ecology, Evolution, Behavior and Systematics
Cited by
18 articles.