ARPIP: Ancestral Sequence Reconstruction with Insertions and Deletions under the Poisson Indel Process

Author:

Jowkar Gholamhossein (1, 2, 3), Pečerska Jūlija (1, 2), Maiolo Massimo (1, 2, 4), Gil Manuel (1, 2), Anisimova Maria (1, 2)

Affiliation:

1. School of Life Sciences and Facility Management, Zurich University of Applied Sciences, CH-8820 Wädenswil, Switzerland

2. Swiss Institute of Bioinformatics, CH-1015 Lausanne, Switzerland

3. Institute of Biology, University of Neuchâtel, CH-2000 Neuchâtel, Switzerland

4. Institute of Pathology, University of Bern, CH-3008 Bern, Switzerland

Abstract

Modern phylogenetic methods allow inference of ancestral molecular sequences given an alignment and a phylogeny relating present-day sequences. This provides insight into the evolutionary history of molecules, helping to understand gene function and to study biological processes such as adaptation and convergent evolution across a variety of applications. Here, we propose a dynamic programming algorithm for fast joint likelihood-based reconstruction of ancestral sequences under the Poisson Indel Process (PIP). Unlike previous approaches, our method, named ARPIP (Ancestral Reconstruction under PIP), reconstructs ancestral sequences with insertions and deletions based on an explicit indel model. Consequently, inferred indel events have an explicit biological interpretation. Likelihood computation is achieved in linear time with respect to the number of sequences. Our method consists of two steps. First, we find the most likely indel points and prune the phylogeny to reflect the insertion and deletion events per site. Second, we infer the ancestral states on the pruned subtree in a manner similar to FastML. We applied ARPIP to simulated data sets and to real data from the Betacoronavirus genus. ARPIP reconstructs both the indel events and substitutions with a high degree of accuracy. Our method fares well when compared to established state-of-the-art methods such as FastML and PAML. Moreover, the method can be extended to explore both optimal and suboptimal reconstructions, to include rate heterogeneity through time, and more. We believe it will expand the range of novel applications of ancestral sequence reconstruction.

[Ancestral sequences; dynamic programming; evolutionary stochastic process; indel; joint ancestral sequence reconstruction; maximum likelihood; Poisson Indel Process; phylogeny; SARS-CoV.]
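The second step described in the abstract, joint reconstruction of ancestral states on a subtree, can be illustrated with a minimal Python sketch. It is not the ARPIP implementation: it assumes a single gap-free site, the JC69 substitution model with uniform base frequencies, and a tree that has already been pruned. The first step (locating indel points under PIP) is omitted entirely. The tuple layout for nodes and all function names are hypothetical choices for illustration. The dynamic programming scheme follows the general idea of joint maximum likelihood reconstruction: for every node, store the best state for each possible parent state on the way up, then read off the states from the root down.

```python
import math

STATES = "ACGT"

def jc69(t):
    """4x4 JC69 transition probabilities for branch length t (assumed model)."""
    e = math.exp(-4.0 * t / 3.0)
    same, diff = 0.25 + 0.75 * e, 0.25 - 0.25 * e
    return [[same if i == j else diff for j in range(4)] for i in range(4)]

# Hypothetical node layout: (name, branch_length_to_parent, children).
# A leaf has an empty children list; the root's branch length is ignored.
def joint_reconstruct(tree, leaf_states):
    """Return (internal-node states, joint likelihood) for one gap-free site."""
    best = {}  # node name -> best own state for each possible parent state

    def up(node):
        # Returns, for each parent state i, the best score of this subtree.
        name, t, children = node
        P = jc69(t)
        if not children:
            a = STATES.index(leaf_states[name])
            best[name] = [a] * 4  # a leaf state is fixed by the data
            return [P[i][a] for i in range(4)]
        child_scores = [up(c) for c in children]
        below = [math.prod(s[j] for s in child_scores) for j in range(4)]
        scores, choice = [], []
        for i in range(4):
            j = max(range(4), key=lambda k: P[i][k] * below[k])
            choice.append(j)
            scores.append(P[i][j] * below[j])
        best[name] = choice
        return scores

    root_name, _, root_children = tree
    child_scores = [up(c) for c in root_children]
    root_scores = [0.25 * math.prod(s[j] for s in child_scores) for j in range(4)]
    root_state = max(range(4), key=lambda k: root_scores[k])

    result = {root_name: STATES[root_state]}

    def down(node, parent_state):
        # Read off the stored best state given the chosen parent state.
        name, _, children = node
        s = best[name][parent_state]
        if children:
            result[name] = STATES[s]
        for c in children:
            down(c, s)

    for c in root_children:
        down(c, root_state)
    return result, root_scores[root_state]

# Toy example: four leaves, one of which carries a substitution.
tree = ("root", 0.0, [
    ("n1", 0.1, [("a", 0.1, []), ("b", 0.1, [])]),
    ("n2", 0.1, [("c", 0.1, []), ("d", 0.1, [])]),
])
states, lik = joint_reconstruct(tree, {"a": "A", "b": "A", "c": "A", "d": "G"})
print(states, lik)
```

In this toy example a single substitution on the branch leading to leaf `d` explains the data best, so every internal node is assigned `A`. Each node is visited once on the way up and once on the way down, so the work per site grows linearly with the number of sequences.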

Funder

Swiss National Science Foundation

Publisher

Oxford University Press (OUP)

Subject

Genetics; Ecology, Evolution, Behavior and Systematics

