Detecting coevolution without phylogenetic trees? Tree-ignorant metrics of coevolution perform as well as tree-aware metrics

Published:2008-12 Issue:1 Volume:8
ISSN:1471-2148
Container-title:BMC Evolutionary Biology
language:en
Short-container-title:BMC Evol Biol

Author:

Caporaso J Gregory, Smit Sandra, Easton Brett C, Hunter Lawrence, Huttley Gavin A, Knight Rob

Abstract

Background: Identifying coevolving positions in protein sequences has myriad applications, ranging from understanding and predicting the structure of single molecules to generating proteome-wide predictions of interactions. Algorithms for detecting coevolving positions can be classified into two categories: tree-aware, which incorporate knowledge of phylogeny, and tree-ignorant, which do not. Tree-ignorant methods are frequently orders of magnitude faster, but are widely held to be insufficiently accurate because of a confounding of shared ancestry with coevolution. We conjectured that by using a null distribution that appropriately controls for the shared-ancestry signal, tree-ignorant methods would exhibit equivalent statistical power to tree-aware methods. Using a novel t-test transformation of coevolution metrics, we systematically compared four tree-aware and five tree-ignorant coevolution algorithms, applying them to myoglobin and myosin. We further considered the influence of sequence recoding using reduced-state amino acid alphabets, a common tactic employed in coevolutionary analyses to improve both statistical and computational performance.

Results: Consistent with our conjecture, the transformed tree-ignorant metrics (particularly Mutual Information) often outperformed the tree-aware metrics. Our examination of the effect of recoding suggested that charge-based alphabets were generally superior for identifying the stabilizing interactions in alpha helices. Performance was not always improved by recoding however, indicating that the choice of alphabet is critical.

Conclusion: The results suggest that t-test transformation of tree-ignorant metrics can be sufficient to control for patterns arising from shared ancestry.
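As a rough illustration of the tree-ignorant approach named in the abstract, the sketch below computes plain Mutual Information between two alignment columns, optionally after recoding residues into a reduced alphabet. It is a minimal sketch under stated assumptions, not the paper's implementation: the `CHARGE_ALPHABET` mapping, the helper names, and the toy alignment are all hypothetical, and the t-test transformation is not shown because the abstract does not specify it.

```python
from collections import Counter
from math import log2

# Hypothetical charge-based reduced alphabet (an assumption for illustration,
# not necessarily the alphabet used in the paper).
CHARGE_ALPHABET = {
    **{aa: "+" for aa in "KR"},   # positively charged residues
    **{aa: "-" for aa in "DE"},   # negatively charged residues
}


def column(alignment, i):
    """Return column i of an alignment given as equal-length strings."""
    return [seq[i] for seq in alignment]


def recode(col, mapping, default="0"):
    """Map each residue to its reduced-alphabet state ("0" = uncharged)."""
    return [mapping.get(res, default) for res in col]


def mutual_information(col_a, col_b):
    """Mutual Information in bits between two equal-length columns."""
    if len(col_a) != len(col_b):
        raise ValueError("columns must contain the same number of sequences")
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), c in count_ab.items():
        p_ab = c / n
        # p_ab / (p_a * p_b), written with raw counts to avoid extra divisions
        mi += p_ab * log2(p_ab * n * n / (count_a[a] * count_b[b]))
    return mi


# Toy alignment (made up): columns 0 and 2 covary, column 1 is independent.
alignment = ["KAD", "KCD", "EAK", "ECK"]

mi_covarying = mutual_information(column(alignment, 0), column(alignment, 2))
mi_independent = mutual_information(column(alignment, 0), column(alignment, 1))
mi_recoded = mutual_information(
    recode(column(alignment, 0), CHARGE_ALPHABET),
    recode(column(alignment, 2), CHARGE_ALPHABET),
)
```

Raw Mutual Information of this kind ignores the phylogeny entirely, which is why the abstract stresses comparing scores against a null distribution that controls for shared ancestry before interpreting them.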

Publisher

Springer Science and Business Media LLC

Subject

Ecology, Evolution, Behavior and Systematics

Link

https://link.springer.com/content/pdf/10.1186/1471-2148-8-327.pdf

References: 47 articles.

1. Freyhult E, Moulton V, Gardner PP: Predicting RNA structure using mutual information. Appl Bioinformatics. 2005, 4: 53-59.

2. Lindgreen S, Gardner PP, Krogh A: Measuring covariation in RNA alignments: physical realism improves information measures. Bioinformatics. 2006, 22 (24): 2988-2995.

3. Yeang CH, Darot JFJ, Noller HF, Haussler D: Detecting the coevolution of biosequences-an example of RNA interaction prediction. Mol Biol Evol. 2007, 24 (9): 2119-2131.

4. Shindyalov IN, Kolchanov NA, Sander C: Can three-dimensional contacts in protein structures be predicted by analysis of correlated mutations?. Protein Engineering. 1994, 7 (3): 349-358.

5. Pollock DD, Taylor WR, Goldman N: Coevolving protein residues: maximum likelihood identification and relationship to structure. J Mol Biol. 1999, 287: 187-198.

Cited by 25 articles.

1. Experimental determination and data-driven prediction of homotypic transmembrane domain interfaces;Computational and Structural Biotechnology Journal;2020

2. Coevolutionary analyses require phylogenetically deep alignments and better null models to accurately detect inter-protein contacts within and between species;BMC Bioinformatics;2015-08-25

3. Deep Sequencing of Protease Inhibitor Resistant HIV Patient Isolates Reveals Patterns of Correlated Mutations in Gag and Protease;PLOS Computational Biology;2015-04-20

4. Allosteric signalling in the outer membrane translocation domain of PapC usher;eLife;2014-10-01

5. Multidimensional mutual information methods for the analysis of covariation in multiple sequence alignments;BMC Bioinformatics;2014-05-22