The Influence of the Number of Tree Searches on Maximum Likelihood Inference in Phylogenomics

Author:

Liu Chao (1,2), Zhou Xiaofan (3), Li Yuanning (4,5), Hittinger Chris Todd (6), Pan Ronghui (7), Huang Jinyan (8), Chen Xue-xin (1), Rokas Antonis (5), Chen Yun (1), Shen Xing-Xing (1,2)

Affiliation:

1. Department of Plant Protection, Key Laboratory of Biology of Crop Pathogens and Insects of Zhejiang Province, Zhejiang University, Hangzhou 310058, China

2. Centre for Evolutionary & Organismal Biology, Zhejiang University, Hangzhou 310058, China

3. Guangdong Province Key Laboratory of Microbial Signals and Disease Control, Integrative Microbiology Research Centre, South China Agricultural University, Guangzhou 510642, China

4. Institute of Marine Science and Technology, Shandong University, Qingdao 266237, China

5. Department of Biological Sciences and Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235, USA

6. Laboratory of Genetics, Wisconsin Energy Institute, Center for Genomic Science Innovation, DOE Great Lakes Bioenergy Research Center, J. F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53706, USA

7. ZJU-Hangzhou Global Scientific and Technological Innovation Center, Zhejiang University, Hangzhou 310027, China

8. Zhejiang Provincial Key Laboratory of Pancreatic Disease, Zhejiang University School of Medicine First Affiliated Hospital, Hangzhou 310003, China

Abstract

Maximum likelihood (ML) phylogenetic inference is widely used in phylogenomics. As heuristic searches most likely find suboptimal trees, it is recommended to conduct multiple (e.g., 10) tree searches in phylogenetic analyses. However, beyond its positive role, how and to what extent multiple tree searches aid ML phylogenetic inference remains poorly explored. Here, we found that a random starting tree was not as effective as the BioNJ and parsimony starting trees in inferring the ML gene tree and that RAxML-NG and PhyML were less sensitive to different starting trees than IQ-TREE. We then examined the effect of the number of tree searches on ML tree inference with IQ-TREE and RAxML-NG, by running 100 tree searches on 19,414 gene alignments from 15 animal, plant, and fungal phylogenomic datasets. We found that the number of tree searches substantially impacted the recovery of the best-of-100 ML gene tree topology among 100 searches for a given ML program. In addition, all of the concatenation-based trees were topologically identical if the number of tree searches was ≥10. Quartet-based ASTRAL trees inferred from 1 to 80 tree searches differed topologically from those inferred from 100 tree searches for 6/15 phylogenomic datasets. Finally, our simulations showed that gene alignments with lower difficulty scores had a higher chance of finding the best-of-100 gene tree topology and were more likely to yield the correct trees.
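To illustrate why the number of tree searches matters for recovering the best-of-100 topology, here is a minimal sketch. It is not the authors' analysis code. It assumes that, for one gene alignment, some number of the 100 searches returned the best-of-100 topology, that a smaller analysis with n searches behaves like a random subset of those 100 drawn without replacement, and that any search returning that topology counts as a success. The function name and parameters are hypothetical.

```python
from math import comb


def recovery_probability(total_searches: int, successes: int, n: int) -> float:
    """Chance that n searches, drawn at random without replacement from
    `total_searches` completed searches, include at least one of the
    `successes` searches that found the best-of-total topology.

    Illustrative assumption: a subset recovers the best topology as soon
    as it contains any one successful search.
    """
    if not 0 <= successes <= total_searches:
        raise ValueError("successes must lie between 0 and total_searches")
    if not 0 <= n <= total_searches:
        raise ValueError("n must lie between 0 and total_searches")
    misses = total_searches - successes
    # comb(misses, n) counts the subsets made up only of unsuccessful
    # searches; it is 0 whenever n exceeds the number of misses.
    return 1.0 - comb(misses, n) / comb(total_searches, n)


if __name__ == "__main__":
    # Hypothetical alignment: 5 of 100 searches found the best topology.
    for n in (1, 10, 20, 50):
        print(n, round(recovery_probability(100, 5, n), 3))
```

Under these assumptions, an alignment for which only a few of the 100 searches succeed gains much more from additional searches than one for which most searches already agree, which is consistent with the dependence on the number of searches described above.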

Funder

National Key R&D Program of China

National Science Foundation for Distinguished Young Scholars of Zhejiang Province

Fundamental Research Funds for the Central Universities

Zhejiang Lab

Research and Development Program of Guangdong Province

USDA National Institute of Food and Agriculture

DOE Great Lakes Bioenergy Research Center

National Natural Science Foundation of China

Leading Innovative and Entrepreneur Team Introduction Program of Zhejiang

Key International Joint Research Program of

National Institutes of Health/National Institute of Allergy and Infectious Diseases

Burroughs Wellcome Fund

Publisher

Oxford University Press (OUP)

