Abstract
Here, we investigate the contributions of coevolutive, evolutive and stochastic information in determining protein-protein interactions (PPIs) from the primary sequences of two interacting protein families, A and B. Specifically, under the assumption that coevolutive information is imprinted on the interacting amino acids of the two proteins, whereas the other (evolutive and stochastic) sources are spread over their sequences, we dissect those contributions in terms of compensatory mutations at physically coupled and uncoupled amino acids of A and B. We find that physically coupled amino acids at short-range distances store the largest per-contact mutual information content, with a significant fraction of that content resulting from coevolutive sources alone. The information stored in coupled amino acids is further shown to discriminate multi-sequence alignments (MSAs) with the largest expectation fraction of PPI matches, a conclusion that holds across various definitions of intermolecular contacts and binding modes. Compared with the information content resulting from evolution at long-range interactions, the mutual information in physically coupled amino acids is the strongest signal for distinguishing PPIs derived from cospeciation. It is likely the only indication in the case of molecular coevolution in independent genomes, since the evolutive information must vanish for uncorrelated proteins.
Significance
The problem of predicting protein-protein interactions (PPIs) from multi-sequence alignments (MSAs) is not yet completely resolved. Previous studies took one or more sources of information into account without clarifying the isolated contributions of coevolutive, evolutive and stochastic information to resolving the problem. By benefiting from data sets made available in the sequence- and structure-rich era, we revisit the field to show that physically coupled amino acids of proteins store the largest per-contact information content to discriminate MSAs with the largest expectation fraction of PPI matches, a result that should guide new developments in the field aimed at characterizing protein interactions in general.
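The per-contact mutual information referred to in the abstract can be illustrated with a minimal sketch. It assumes a plain plug-in (frequency-count) estimator over paired rows of two toy alignments, with no sequence reweighting, pseudocounts or gap handling; it is not the authors' pipeline. The function names `column_mutual_information` and `mean_contact_information`, the toy alignments and the contact list are all hypothetical.

```python
import math
from collections import Counter


def column_mutual_information(col_a, col_b):
    """Plug-in mutual information (in nats) between two alignment columns.

    col_a and col_b hold one residue per paired sequence, so row k of
    family A is assumed to interact with row k of family B.
    """
    if len(col_a) != len(col_b):
        raise ValueError("columns must come from the same number of paired rows")
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        p_ab = n_ab / n
        # p_ab / (p_a * p_b), written with raw counts
        mi += p_ab * math.log(p_ab * n * n / (count_a[a] * count_b[b]))
    return mi


def mean_contact_information(msa_a, msa_b, contacts):
    """Average mutual information over a list of (i, j) column pairs.

    contacts lists column i of family A against column j of family B,
    for example residue pairs that are physically coupled in a structure.
    """
    cols_a = list(zip(*msa_a))
    cols_b = list(zip(*msa_b))
    total = sum(column_mutual_information(cols_a[i], cols_b[j]) for i, j in contacts)
    return total / len(contacts)


# Toy paired alignments (hypothetical data): row k of A pairs with row k of B.
msa_a = ["AK", "AK", "GR", "GR"]
msa_b = ["DE", "DE", "EE", "EE"]

# Column 0 of A covaries perfectly with column 0 of B; column 1 of B is conserved.
coupled = mean_contact_information(msa_a, msa_b, [(0, 0)])    # ln 2, about 0.693
uncoupled = mean_contact_information(msa_a, msa_b, [(1, 1)])  # 0.0
print(coupled, uncoupled)
```

In this toy case the covarying pair carries ln 2 nats while the pair involving a conserved column carries none. On real alignments the estimate at any pair mixes coevolutive, evolutive and stochastic contributions, which is why the study separates them rather than reading the raw value directly.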
Publisher
Cold Spring Harbor Laboratory