Abstract
Given trees T and T* on the same taxon set, the transfer index ϕ(b, T*) is the number of taxa that need to be ignored so that the bipartition induced by branch b in T is equal to some bipartition in T*. Recently, Lemoine et al. [14] used the transfer index to design a novel bootstrap analysis technique that improves on Felsenstein’s bootstrap on large, noisy data sets. In this work, we propose an algorithm that computes the transfer index for all branches b ∈ T in O(n log^3 n) time, which improves upon the current O(n^2)-time algorithm by Lin, Rajan and Moret [15]. Our implementation is able to process pairs of trees with hundreds of thousands of taxa in minutes and considerably speeds up the method of Lemoine et al. on large data sets. We believe our algorithm can be useful for comparing large phylogenies, especially when some taxa are misplaced (e.g. due to horizontal gene transfer, recombination, or reconstruction errors).
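To make the definition concrete, the following is a minimal brute-force sketch of the transfer index. It is not the O(n log^3 n) algorithm proposed here, nor the O(n^2) algorithm of Lin, Rajan and Moret. It assumes trees are given as nested Python tuples with taxon names at the leaves, that each bipartition is represented by the leaf set on one side of a branch, and that terminal branches of T* count as candidate bipartitions. The names `clades`, `transfer_distance` and `transfer_indices` are hypothetical helpers introduced for illustration only.

```python
def clades(tree):
    """Return (all taxa, leaf sets below every non-root node) of a nested-tuple tree."""
    found = []

    def walk(node):
        if isinstance(node, tuple):
            # Internal node: union of the leaf sets of its children.
            leaves = frozenset().union(*(walk(child) for child in node))
        else:
            # Leaf: a single taxon name.
            leaves = frozenset([node])
        found.append(leaves)
        return leaves

    taxa = walk(tree)
    # The full taxon set does not induce a bipartition, so it is dropped.
    return taxa, [c for c in found if c != taxa]


def transfer_distance(a, a_star, n):
    """Taxa to ignore so that bipartitions a|rest and a_star|rest coincide."""
    d = len(a ^ a_star)       # taxa on which the two chosen sides disagree
    return min(d, n - d)      # the other side of a_star may match better


def transfer_indices(t, t_star):
    """Brute-force transfer index for every branch of t (assumed sketch, cubic time)."""
    taxa, branches = clades(t)
    taxa_star, ref = clades(t_star)
    assert taxa == taxa_star, "both trees must be on the same taxon set"
    n = len(taxa)
    return {a: min(transfer_distance(a, c, n) for c in ref) for a in branches}


# Toy example: taxon "c" is misplaced in t_star relative to t.
t = ((("a", "b"), "c"), ("d", "e"))
t_star = ((("a", "c"), "b"), ("d", "e"))
idx = transfer_indices(t, t_star)
```

In this toy example the branch above {a, b} in T has no exact match in T*, but ignoring a single taxon is enough to recover one, so its transfer index is 1; branches that also occur in T* get index 0.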
Publisher
Cold Spring Harbor Laboratory
References (21 articles)
1. https://bitbucket.org/thekswenson/rapid_transferindex/.
2. https://github.com/thekswenson/booster.
3. Matching split distance for unrooted binary phylogenetic trees; IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2011
4. Computing the quartet distance between evolutionary trees in time O(n log n); Algorithmica, 2004
5. Daniel G Brown and Jakub Truszkowski. Fast phylogenetic tree reconstruction using locality-sensitive hashing. In Algorithms in Bioinformatics, pages 14–29. Springer, 2012.