Affiliation:
1. Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University
2. Key Laboratory of System Control and Information Processing, Ministry of Education of China, 200240 Shanghai, China
Abstract
Motivation
Knowledge of protein–ligand binding residues is important for understanding the functions of proteins and their interaction mechanisms. Accurately identifying the potential binding sites of a specific ligand from an experimentally solved protein structure remains a challenging problem. Compared with structure-alignment-based methods, machine learning algorithms provide a flexible alternative that depends less on annotated homogeneous protein structures. Several factors are important for an efficient protein–ligand binding prediction model, e.g. a discriminative feature representation and an effective learning architecture that can handle data that are both large-scale and severely imbalanced.
Results
In this study, we propose a novel deep-learning-based method called DELIA for protein–ligand binding residue prediction. In DELIA, a hybrid deep neural network is designed to integrate 1D sequence-based features with 2D structure-based amino acid distance matrices. To overcome the severe data imbalance between binding and nonbinding residues, three strategies are used to enhance the model: oversampling within mini-batches, random undersampling and a stacking ensemble. Experimental results on five benchmark datasets demonstrate the effectiveness of the proposed DELIA pipeline.
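As a rough illustration of two of the imbalance-handling ideas named above (random undersampling and oversampling within mini-batches), the following minimal Python sketch may help. It is not DELIA's implementation: the function names, the negative-to-positive ratio, the batch size and the positive fraction are all hypothetical choices made for this example, and the stacking ensemble is omitted.

```python
import random


def undersample(labels, ratio, rng):
    """Keep every positive index and a random subset of negatives.

    `ratio` (hypothetical parameter) is the number of negatives kept
    per positive; label 1 = binding residue, 0 = nonbinding residue.
    """
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    keep_neg = rng.sample(neg, min(len(neg), ratio * len(pos)))
    return sorted(pos + keep_neg)


def oversampled_batches(labels, indices, batch_size, pos_fraction, rng):
    """Yield mini-batches with a fixed number of positive slots.

    Negatives are visited once each; positives are drawn with
    replacement so every batch contains minority-class examples.
    Assumes at least one positive and 0 < pos_fraction < 1.
    """
    pos = [i for i in indices if labels[i] == 1]
    neg = [i for i in indices if labels[i] == 0]
    n_pos = max(1, int(batch_size * pos_fraction))
    n_neg = batch_size - n_pos
    rng.shuffle(neg)
    for start in range(0, len(neg), n_neg):
        batch = neg[start:start + n_neg] + rng.choices(pos, k=n_pos)
        rng.shuffle(batch)
        yield batch


# Toy data: 5 binding residues among 100 (illustrative numbers only).
rng = random.Random(0)
labels = [1] * 5 + [0] * 95
kept = undersample(labels, ratio=6, rng=rng)
batches = list(
    oversampled_batches(labels, kept, batch_size=8, pos_fraction=0.25, rng=rng)
)
```

In this toy setup the undersampling step reduces the 95 nonbinding residues to 30, and each mini-batch then carries two binding residues, so the learner sees the minority class in every update even though it makes up only a small share of the data.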
Availability and implementation
The web server of DELIA is available at www.csbio.sjtu.edu.cn/bioinf/delia/.
Supplementary information
Supplementary data are available at Bioinformatics online.
Funder
National Key Research and Development Program of China
National Natural Science Foundation of China
Science and Technology Commission of Shanghai Municipality
Publisher
Oxford University Press (OUP)
Subject
Computational Mathematics, Computational Theory and Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Statistics and Probability
Cited by
46 articles.