Author:
Sul Seung-Jin, Matthews Suzanne, Williams Tiffani L
Abstract
Background
Evolutionary trees are family trees that represent the relationships among a group of organisms. Phylogenetic heuristics are used to search stochastically for the best-scoring trees in tree space. Given that better tree scores are believed to be better approximations of the true phylogeny, traditional evaluation techniques have used tree scores to determine which heuristics find the best scores in the fastest time. We develop new techniques that evaluate phylogenetic heuristics based on both tree scores and topologies, and we use them to compare Pauprat and Rec-I-DCM3, two popular Maximum Parsimony search algorithms.
Results
Our results show that although Pauprat and Rec-I-DCM3 find trees with the same best scores, these trees are topologically quite different. Furthermore, the Rec-I-DCM3 trees cluster distinctly from the Pauprat trees. In addition to heatmap visualizations that use parsimony scores and the Robinson-Foulds distance to compare the best-scoring trees found by the two heuristics, we develop entropy-based methods to show the diversity of the trees found. Overall, Pauprat identifies more diverse trees than Rec-I-DCM3.
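To make the two topological measures concrete, the following is a minimal sketch and not the authors' implementation. It assumes trees are given as nested tuples of leaf names, computes the Robinson-Foulds distance as the number of non-trivial bipartitions found in exactly one of two trees, and, as one possible reading of "entropy-based", takes the Shannon entropy of the distribution of distinct topologies in a collection. All function names are hypothetical, and the entropy definition used in the paper may differ.

```python
import math
from collections import Counter


def leaves(tree):
    """Return the frozenset of leaf names under a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def bipartitions(tree):
    """Collect the non-trivial bipartitions of a tree, viewed as unrooted.

    Each bipartition is stored as the side that does NOT contain a fixed
    anchor taxon, so the same split always gets the same representation.
    """
    taxa = leaves(tree)
    anchor = min(taxa)
    splits = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        side = taxa - clade if anchor in clade else clade
        # Skip trivial splits (single leaves and the whole taxon set).
        if 2 <= len(side) <= len(taxa) - 2:
            splits.add(side)
        return clade

    visit(tree)
    return frozenset(splits)


def rf_distance(tree_a, tree_b):
    """Robinson-Foulds distance: bipartitions present in exactly one tree."""
    return len(bipartitions(tree_a) ^ bipartitions(tree_b))


def rf_matrix(trees):
    """Pairwise RF distances, e.g. as input to a heatmap."""
    return [[rf_distance(a, b) for b in trees] for a in trees]


def topology_entropy(trees):
    """Shannon entropy (bits) of the distribution of distinct topologies.

    Assumption: two trees share a topology when their bipartition sets match.
    """
    counts = Counter(bipartitions(t) for t in trees)
    total = sum(counts.values())
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())
```

Under this sketch, a collection in which every tree has the same topology has entropy 0, and the entropy grows as the trees spread over more distinct topologies, which is the sense in which one heuristic could be said to find more diverse trees than another.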
Conclusion
Overall, our work shows that there is value in comparing heuristics beyond the parsimony scores that they find. Pauprat is a slower heuristic than Rec-I-DCM3. However, our work shows that there is tremendous value in using Pauprat to reconstruct trees, especially since it finds identically scoring but topologically distinct trees. Hence, instead of discounting Pauprat, effort should go into improving its implementation. Ultimately, improved performance measures lead to better phylogenetic heuristics and will result in better approximations of the true evolutionary history of the organisms of interest.
Publisher
Springer Science and Business Media LLC
Subject
Applied Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Structural Biology