Information theoretic generalized Robinson–Foulds metrics for comparing phylogenetic trees

Published: 2020-07-03 Issue: 20 Volume: 36 Page: 5007-5013
ISSN: 1367-4803
Container-title: Bioinformatics
Language: en

Author:

Smith Martin R¹

Affiliation:

1. Department of Earth Sciences, Lower Mountjoy, Durham University, Durham DH1 3LE, UK

Abstract

Motivation

The Robinson–Foulds (RF) metric is widely used by biologists, linguists and chemists to quantify similarity between pairs of phylogenetic trees. The measure tallies the number of bipartition splits that occur in both trees, but this conservative approach ignores potential similarities between almost-identical splits, with undesirable consequences. ‘Generalized’ RF metrics address this shortcoming by pairing splits in one tree with similar splits in the other. Each pair is assigned a similarity score, the sum of which enumerates the similarity between two trees. The challenge lies in quantifying split similarity: existing definitions lack a principled statistical underpinning, resulting in misleading tree distances that are difficult to interpret. Here, I propose probabilistic measures of split similarity, which allow tree similarity to be measured in natural units (bits).

Results

My new information-theoretic metrics outperform alternative measures of tree similarity when evaluated against a broad suite of criteria, even though they do not account for the non-independence of splits within a single tree. Mutual clustering information exhibits none of the undesirable properties that characterize other tree comparison metrics, and should be preferred to the RF metric.

Availability and implementation

The methods discussed in this article are implemented in the R package ‘TreeDist’, archived at https://dx.doi.org/10.5281/zenodo.3528123.

Supplementary information

Supplementary data are available at Bioinformatics online.
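The contrast drawn in the abstract, between counting exactly shared splits and scoring similar splits in bits, can be illustrated with a minimal Python sketch. It is not the TreeDist implementation: the function names, the toy trees and the representation of a split as the set of leaves on one side are all assumptions made for illustration. The RF distance is computed here in its common form as the number of splits found in exactly one tree, and the similarity of two splits is taken to be the standard mutual information between two two-way partitions of the same leaf set. The optimal pairing of splits that a generalized RF metric requires is not shown.

```python
from math import log2

def normalise(split, leaves):
    """Represent a bipartition by the side that excludes a fixed reference leaf."""
    side = frozenset(split)
    ref = min(leaves)  # arbitrary but consistent reference leaf
    return frozenset(leaves) - side if ref in side else side

def rf_distance(splits_a, splits_b, leaves):
    """Count the splits that occur in exactly one of the two trees."""
    a = {normalise(s, leaves) for s in splits_a}
    b = {normalise(s, leaves) for s in splits_b}
    return len(a ^ b)

def mutual_clustering_information(split_a, split_b, leaves):
    """Mutual information, in bits, between two bipartitions of one leaf set."""
    n = len(leaves)
    all_leaves = frozenset(leaves)
    sides_a = [frozenset(split_a), all_leaves - frozenset(split_a)]
    sides_b = [frozenset(split_b), all_leaves - frozenset(split_b)]
    total = 0.0
    for x in sides_a:
        for y in sides_b:
            overlap = len(x & y)
            if overlap:  # empty intersections contribute nothing
                total += (overlap / n) * log2(overlap * n / (len(x) * len(y)))
    return total

# Hypothetical six-leaf trees, each given by its three non-trivial splits.
leaves = "ABCDEF"
tree1 = [{"A", "B"}, {"A", "B", "C"}, {"E", "F"}]
tree2 = [{"A", "B"}, {"A", "B", "D"}, {"E", "F"}]

print(rf_distance(tree1, tree2, leaves))
print(mutual_clustering_information({"A", "B", "C"}, {"A", "B", "D"}, leaves))
```

In this toy case the RF distance treats ABC|DEF and ABD|CEF as simply different, whereas the mutual information between them is positive, which is the kind of partial similarity that a generalized metric can credit once splits have been paired.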

Publisher

Oxford University Press (OUP)

Subject

Computational Mathematics, Computational Theory and Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Statistics and Probability

Link

http://academic.oup.com/bioinformatics/advance-article-pdf/doi/10.1093/bioinformatics/btaa614/33866911/btaa614.pdf

References: 42 articles.

1. The Generalized Robinson-Foulds Metric

2. Matching split distance for unrooted binary phylogenetic trees;Bogdanowicz;IEEE/ACM Trans. Comput. Biol. Bioinform,2012

3. On a matching distance between rooted phylogenetic trees;Bogdanowicz;Int. J. Appl. Math. Comput. Sci,2013

4. Comparing phylogenetic trees by matching nodes using the transfer distance between partitions;Bogdanowicz;J. Comput. Biol,2017

Cited by 101 articles.

1. Robustness and evolvability: Revisited, redefined and applied;BioSystems;2024-12

2. Estimating the mean in the space of ranked phylogenetic trees;Bioinformatics;2024-08

3. Phylogeny structures species' interactions in experimental ecological communities;Ecology Letters;2024-08

4. Phylogenomics of the pantropical Connaraceae: revised infrafamilial classification and the evolution of heterostyly;Plant Systematics and Evolution;2024-08

5. Organ systems of a Cambrian euarthropod larva;Nature;2024-07-31