Affiliation:
1. Department of Computer Science, The University of Western Ontario London, ON N6A 5B7, Canada
2. Department of Biology, McMaster University, Hamilton, ON L8S 4K1, Canada
Abstract
Abstract
Motivation
Proteins usually perform their functions by interacting with other proteins, which is why accurately predicting protein–protein interaction (PPI) binding sites is a fundamental problem. Experimental methods are slow and expensive. Therefore, great efforts are being made towards increasing the performance of computational methods.
Results
We propose DEep Learning Prediction of Highly probable protein Interaction sites (DELPHI), a new sequence-based deep learning suite for PPI-binding sites prediction. DELPHI has an ensemble structure which combines a CNN and a RNN component with fine tuning technique. Three novel features, HSP, position information and ProtVec are used in addition to nine existing ones. We comprehensively compare DELPHI to nine state-of-the-art programmes on five datasets, and DELPHI outperforms the competing methods in all metrics even though its training dataset shares the least similarities with the testing datasets. In the most important metrics, AUPRC and MCC, it surpasses the second best programmes by as much as 18.5% and 27.7%, respectively. We also demonstrated that the improvement is essentially due to using the ensemble model and, especially, the three new features. Using DELPHI it is shown that there is a strong correlation with protein-binding residues (PBRs) and sites with strong evolutionary conservation. In addition, DELPHI’s predicted PBR sites closely match known data from Pfam. DELPHI is available as open-sourced standalone software and web server.
Availability and implementation
The DELPHI web server can be found at delphi.csd.uwo.ca/, with all datasets and results in this study. The trained models, the DELPHI standalone source code, and the feature computation pipeline are freely available at github.com/lucian-ilie/DELPHI.
Supplementary information
Supplementary data are available at Bioinformatics online.
Funder
NSERC Discovery
Research Tools and Instruments Grant
NSERC Discovery Grant
Publisher
Oxford University Press (OUP)
Subject
Computational Mathematics,Computational Theory and Mathematics,Computer Science Applications,Molecular Biology,Biochemistry,Statistics and Probability
Cited by
82 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献