Impact of phylogeny on structural contact inference from protein sequence data

Published: 2022-09-27

Author:

Dietler Nicola, Lupo Umberto, Bitbol Anne-Florence

Abstract

Local and global inference methods have been developed to infer structural contacts from multiple sequence alignments of homologous proteins. They rely on correlations in amino-acid usage at contacting sites. Because homologous proteins share a common ancestry, their sequences also feature phylogenetic correlations, which can impair contact inference. We investigate this effect by generating controlled synthetic data from a minimal model where the importance of contacts and of phylogeny can be tuned. We demonstrate that global inference methods, specifically Potts models, are more resilient to phylogenetic correlations than local methods based on covariance or mutual information. This holds whether or not phylogenetic corrections are used, and may explain the success of global methods. We analyse the roles of selection strength and of phylogenetic relatedness. We show that sites that mutate early in the phylogeny yield false positive contacts. We consider natural data and realistic synthetic data, and our findings generalise to these cases. Our results highlight the impact of phylogeny on contact prediction from protein sequences and illustrate the interplay between the rich structure of biological data and inference.
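To make the notion of a "local" method concrete, the sketch below scores pairs of alignment columns by their mutual information and ranks them as contact candidates. It is an illustration only and not the authors' implementation: the function names and the toy alignment are hypothetical, frequencies are plain counts, and it applies no pseudocounts, sequence reweighting, or phylogenetic correction.

```python
from collections import Counter
from itertools import combinations
from math import log


def column_mutual_information(msa, i, j):
    """Mutual information (in nats) between columns i and j of an alignment.

    `msa` is a list of equal-length strings. Frequencies are raw counts,
    with no pseudocount or sequence reweighting (a simplifying assumption).
    """
    n = len(msa)
    counts_i = Counter(seq[i] for seq in msa)
    counts_j = Counter(seq[j] for seq in msa)
    counts_ij = Counter((seq[i], seq[j]) for seq in msa)
    mi = 0.0
    for (a, b), c in counts_ij.items():
        # p_ij * log(p_ij / (p_i * p_j)), written in terms of counts
        mi += (c / n) * log(c * n / (counts_i[a] * counts_j[b]))
    return mi


def rank_site_pairs(msa):
    """Return all column pairs sorted by decreasing mutual information."""
    length = len(msa[0])
    scores = {
        (i, j): column_mutual_information(msa, i, j)
        for i, j in combinations(range(length), 2)
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy alignment: columns 0 and 1 covary perfectly,
# column 2 varies independently of both.
toy_msa = ["AKL", "AKV", "GEL", "GEV"]
top_pair = rank_site_pairs(toy_msa)[0]
```

Because each pair is scored in isolation, a score of this kind cannot tell apart correlations caused by contacts from correlations caused by shared ancestry, which is the weakness of local methods that the abstract describes.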

Publisher

Cold Spring Harbor Laboratory

Cited by 1 article.

1. Impact of phylogeny on structural contact inference from protein sequence data; Journal of The Royal Society Interface; 2023-02