Affiliation:
1. Division of Paleontology (Invertebrates), American Museum of Natural History, New York, NY, USA
2. Department of Computer Science, Hunter College, City University of New York, New York, NY, USA
3. Division of Invertebrate Zoology, American Museum of Natural History, New York, NY, USA
Abstract
Popular optimality criteria for phylogenetic trees focus on sequences of characters that are applicable to all the taxa. As studies grow in breadth, some characters may be applicable to a portion of the taxa and inapplicable to others. Past work has explored the limitations of treating inapplicable characters as missing data, noting that this strategy may favor trees where internal nodes are assigned impossible states, where the arrangement of taxa within subclades is unduly influenced by variation in distant parts of the tree, and/or where taxa that otherwise share most primary characters are grouped distantly. Approaches that avoid the first two problems have recently been proposed. Here, we propose an alternative approach that avoids all three problems. We focus on data matrices that use reductive coding of traits, that is, explicitly incorporate the innate hierarchy induced by inapplicability, and as such our approach extends to hierarchical characters in general. In the spirit of maximum parsimony, the proposed criterion seeks the phylogenetic tree with the minimal changes across any tree branch, but where changes are defined in terms of dissimilarity metrics that weigh the effects of inapplicable characters. The approach can accommodate binary, multistate, ordered, unordered, and polymorphic characters. We give a polynomial-time algorithm, inspired by Fitch’s algorithm, to score trees under a family of dissimilarity metrics, and prove its correctness. We show that the resulting optimality criterion is computationally hard, by reduction to the NP-hardness of the maximum parsimony optimality criterion. We demonstrate our approach using synthetic and empirical data sets and compare the results with other recently proposed methods for choosing optimal phylogenetic trees when the data include hierarchical characters. [Character optimization, dissimilarity metrics, hierarchical characters, inapplicable data, phylogenetic tree search.]
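For context, the classic Fitch procedure that the abstract names as its inspiration can be sketched as below. This is only the standard parsimony scoring of a single unordered character on a rooted binary tree, not the authors' dissimilarity-based criterion. The function name `fitch_score`, the nested-tuple tree encoding, and the taxon names in the usage note are assumptions made for illustration.

```python
def fitch_score(tree, states):
    """Score one unordered character on a rooted binary tree (classic Fitch pass).

    tree   -- a leaf name (str) or a (left, right) tuple of subtrees (assumed encoding)
    states -- dict mapping each leaf name to the set of states observed there
    Returns (candidate_state_set, minimum_number_of_changes) for the subtree.
    """
    if isinstance(tree, str):
        # A leaf contributes its observed state set and no changes.
        return set(states[tree]), 0

    left, right = tree
    left_set, left_cost = fitch_score(left, states)
    right_set, right_cost = fitch_score(right, states)

    shared = left_set & right_set
    if shared:
        # The children agree on at least one state: no extra change needed here.
        return shared, left_cost + right_cost
    # The children disagree: take the union and count one change.
    return left_set | right_set, left_cost + right_cost + 1
```

As a usage example, for the hypothetical tree `(("A", "B"), ("C", "D"))` with states `{"A": {0}, "B": {1}, "C": {1}, "D": {1}}`, the pass counts one change at the ancestor of A and B and none elsewhere. Polymorphic tips can be passed as sets with more than one state.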
Funder
Simons Foundation (K.S.) and the National Science Foundation
Publisher
Oxford University Press (OUP)
Subject
Genetics, Ecology, Evolution, Behavior and Systematics
Cited by
19 articles.