Author:
Jia Kejue, Jernigan Robert L.
Abstract
Protein sequence matching does not properly account for some well-known features of protein structures, including surface residues being more variable than core residues and the high packing densities in globular proteins, and it does not yield good matches between the sequences of many proteins known to be close structural relatives. Abundant protein sequences and structures are now available to enable major improvements to sequence matching. Here, we use structural frameworks to mount the observed correlated sequences and identify the most important correlated parts. The rationale is that protein structures provide the important physical framework for improving sequence matching. Combining the sequence and structure data in this way leads to a simple amino acid substitution matrix that can be readily incorporated into any sequence matching. This enables the incorporation of allosteric information into sequence matching and effectively transforms it from a 1-D to a 3-D procedure. Tests on over 3,000 sequence matches demonstrate a 37% gain in sequence similarity and a 26% reduction in gaps compared with BLOSUM62. Importantly, there are also major gains in the specificity of sequence matching across diverse proteins. Specifically, all known cases where protein structures match but sequences do not match well are resolved.
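To illustrate what it means for a substitution matrix to be "readily incorporated into any sequence matching", the following is a minimal sketch of global alignment (Needleman-Wunsch with a linear gap penalty) in which the matrix is simply a lookup table passed in as an argument. The function names `make_toy_matrix` and `global_align`, the gap penalty, and the toy match/mismatch scores are assumptions made for illustration only; they are not the authors' matrix or code, and a real application would load a full 20 x 20 amino acid matrix instead.

```python
from typing import Dict, List, Tuple

# A substitution matrix as a plain lookup: (residue_a, residue_b) -> score.
Matrix = Dict[Tuple[str, str], float]


def make_toy_matrix(alphabet: str, match: float, mismatch: float) -> Matrix:
    """Hypothetical stand-in for a real matrix (e.g. BLOSUM62 or a
    structure-derived one): identical residues score `match`, others `mismatch`."""
    return {(a, b): (match if a == b else mismatch)
            for a in alphabet for b in alphabet}


def global_align(seq_a: str, seq_b: str, matrix: Matrix,
                 gap: float = -2.0) -> Tuple[float, str, str]:
    """Needleman-Wunsch global alignment with a linear gap penalty.

    The substitution matrix is the only place residue similarity enters,
    so swapping matrices changes the alignment without changing the algorithm.
    """
    n, m = len(seq_a), len(seq_b)
    score: List[List[float]] = [[0.0] * (m + 1) for _ in range(n + 1)]
    # Traceback pointers: 'D' diagonal, 'U' gap in seq_b, 'L' gap in seq_a.
    ptr: List[List[str]] = [[""] * (m + 1) for _ in range(n + 1)]

    for i in range(1, n + 1):
        score[i][0] = i * gap
        ptr[i][0] = "U"
    for j in range(1, m + 1):
        score[0][j] = j * gap
        ptr[0][j] = "L"

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + matrix[(seq_a[i - 1], seq_b[j - 1])]
            up = score[i - 1][j] + gap
            left = score[i][j - 1] + gap
            best = max(diag, up, left)
            score[i][j] = best
            # Prefer substitutions over gaps when scores tie.
            ptr[i][j] = "D" if best == diag else ("U" if best == up else "L")

    # Walk the pointers back from the bottom-right corner.
    out_a: List[str] = []
    out_b: List[str] = []
    i, j = n, m
    while i > 0 or j > 0:
        move = ptr[i][j]
        if move == "D":
            out_a.append(seq_a[i - 1]); out_b.append(seq_b[j - 1])
            i -= 1; j -= 1
        elif move == "U":
            out_a.append(seq_a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(seq_b[j - 1])
            j -= 1

    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    toy = make_toy_matrix("ACD", match=1.0, mismatch=-1.0)
    print(global_align("ACD", "AD", toy))
```

Because the matrix is an argument rather than part of the algorithm, comparing two matrices on the same sequence pair only requires calling `global_align` twice and comparing the resulting scores and gap counts.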
Publisher
Cold Spring Harbor Laboratory