Abstract
Inferring a phylogenetic tree is a fundamental challenge in evolutionary studies. Current paradigms for phylogenetic tree reconstruction rely on performing costly likelihood optimizations. To make tree inference feasible for problems involving more than a handful of sequences, inference under the maximum-likelihood paradigm integrates heuristic approaches that evaluate only a subset of all potential trees. Consequently, existing methods suffer from the known tradeoff between accuracy and running time. In this proof-of-concept study, we train a machine-learning algorithm over an extensive cohort of empirical data to predict the neighboring trees that increase the likelihood, without actually computing their likelihood. This provides a means to safely discard a large portion of the search space, thus potentially accelerating heuristic tree searches without losing accuracy. Our analyses suggest that machine learning can guide tree-search methodologies towards the most promising candidate trees.
Funder
Israel Science Foundation
Publisher
Springer Science and Business Media LLC
Subject
General Physics and Astronomy; General Biochemistry, Genetics and Molecular Biology; General Chemistry
Cited by
33 articles.