Affiliation:
1. School of Information Science and Technology, Northeast Normal University, Changchun, 130024 Jilin, China
2. Graduate School, Northeast Normal University, Changchun 130024, Jilin, China
Abstract
Prediction of protein-protein interaction (PPI) sites is one of the most perplexing problems in drug discovery and computational biology. Although significant progress has been made by combining different machine learning techniques with a variety of distinct characteristics, the problem still remains unresolved. In this study, a technique for PPI sites is presented using a random forest (RF) algorithm followed by the minimum redundancy maximal relevance (mRMR) approach, and the method of incremental feature selection (IFS). Physicochemical properties of proteins and the features of the residual disorder, sequence conservation, secondary structure, and solvent accessibility are incorporated. Five 3D structural characteristics are also used to predict PPI sites. Analysis of features shows that 3D structural features such as relative solvent-accessible surface area (RASA) and surface curvature (SC) help in the prediction of PPI sites. Results show that the performance of the proposed predictor is superior to several other state-of-the-art predictors, whose average prediction accuracy is 81.44%, sensitivity is 82.17%, and specificity is 80.71%, respectively. The proposed predictor is expected to become a helpful tool for finding PPI sites, and the feature analysis presented in this study will give useful insights into protein interaction mechanisms.
Subject
Biochemistry (medical),Clinical Biochemistry,Genetics,Molecular Biology,General Medicine
Cited by
1 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献