Abstract
Background
Stability is one of the most fundamental intrinsic characteristics of proteins and can be determined with various methods. Characterization of protein properties has not kept pace with the growth of new sequence data, so even basic properties are unknown for the vast majority of identified proteins. There have been some attempts to develop predictors of protein stability; however, they have suffered from the small number of known examples.
Results
We took advantage of results from a recently developed cellular stability method, which is based on limited proteolysis and mass spectrometry, and developed a machine learning method using gradient boosting of regression trees. The ProTstab method has high performance and is well suited for large-scale prediction of protein stabilities.
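To illustrate the technique named above, the following is a minimal sketch of gradient boosting of regression trees. It is not the ProTstab implementation. It assumes squared-error loss, depth-1 trees (stumps) on a single numeric feature with at least two distinct values, and made-up toy data. The names `fit_stump`, `fit_gradient_boosting` and `predict`, and the parameters `n_rounds` and `learning_rate`, are hypothetical.

```python
from statistics import mean


def fit_stump(xs, residuals):
    """Fit a depth-1 regression tree: choose the split that minimizes squared error."""
    best = None
    values = sorted(set(xs))
    # Candidate thresholds are midpoints between neighbouring distinct feature values.
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean, right_mean = mean(left), mean(right)
        sse = sum((r - left_mean) ** 2 for r in left)
        sse += sum((r - right_mean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def fit_gradient_boosting(xs, ys, n_rounds=50, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = mean(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For squared-error loss the negative gradient is the plain residual.
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_value, right_value = fit_stump(xs, residuals)
        stumps.append((threshold, left_value, right_value))
        predictions = [
            p + learning_rate * (left_value if x <= threshold else right_value)
            for x, p in zip(xs, predictions)
        ]
    return base, learning_rate, stumps


def predict(model, x):
    """Sum the base value and the shrunken contribution of every stump."""
    base, learning_rate, stumps = model
    return base + learning_rate * sum(
        left if x <= threshold else right for threshold, left, right in stumps
    )


# Toy data (hypothetical): one feature per protein and a scaled stability value.
toy_features = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
toy_stability = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
toy_model = fit_gradient_boosting(toy_features, toy_stability)
print(predict(toy_model, 2.5))
```

A practical predictor would use many features per protein and deeper trees, typically through an established library rather than hand-written code.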
Conclusions
The Pearson correlation coefficient was 0.793 in 10-fold cross-validation and 0.763 in an independent blind test. The corresponding mean absolute errors were 0.024 and 0.036, respectively. Comparison with a previously published method indicated that ProTstab has superior performance. We used the method to predict the stabilities of all the remaining proteins in the human proteome and then correlated the predicted stabilities with the chain lengths of protein isoforms and with protein localizations.
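The two evaluation measures reported above, and a 10-fold split, can be sketched in plain Python as follows. This is an illustrative sketch only: the function names `pearson_r`, `mean_absolute_error` and `k_fold_indices` are hypothetical, the interleaved fold assignment is an assumption rather than the authors' procedure, and the example values are made up.

```python
from math import sqrt


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / sqrt(var_o * var_p)


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists; fold i tests every k-th sample starting at i."""
    for fold in range(k):
        test = list(range(fold, n_samples, k))
        train = [i for i in range(n_samples) if i % k != fold]
        yield train, test


# Made-up observed and predicted stability values, for illustration only.
observed = [0.42, 0.55, 0.61, 0.48, 0.70]
predicted = [0.45, 0.52, 0.65, 0.47, 0.66]
print(pearson_r(observed, predicted))
print(mean_absolute_error(observed, predicted))
```

In cross-validation, each of the k folds serves once as the test set while a model is trained on the remaining samples, and the measures are computed on the held-out predictions.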
Funder
Vetenskapsrådet
National Natural Science Foundation of China
University Natural Science Research Project of Anhui Province
Publisher
Springer Science and Business Media LLC