Affiliation:
1. Computational Biology Research Lab (CBRL), Department of Computer Science, National University of Computer and Emerging Sciences, Islamabad 44000, Pakistan
Abstract
Motivation
Understanding an enzyme’s function is one of the most important problems in computational biology. Enzymes are key components of all organisms and of many industrial processes: they help fight diseases and speed up essential chemical reactions. Because of these wide applications, the discovery of new enzymatic proteins can accelerate biological research and commercial productivity. However, biological experiments to determine an enzyme’s function are time-consuming and resource-intensive.
Results
In this study, we propose a novel computational approach to predict an enzyme’s function up to the fourth level of the Enzyme Commission (EC) number. Many studies have attempted to predict an enzyme’s function, yet no approach has properly tackled the fourth and final level of the EC number. The fourth level holds great significance because it gives the most specific information about how an enzyme performs its function. Our method combines deep learning approaches with an efficient hierarchical classification scheme to predict an enzyme’s precise function. On a dataset of 11 353 enzymes and 402 classes, we achieved a hierarchical accuracy of 91.2% and a Macro-F1 score of 81.9% on the fourth level. Moreover, our method can be used to predict the function of enzyme isoforms with considerable success. This methodology is broadly applicable to genome-wide prediction, which can subsequently lead to automated annotation of enzyme databases and the identification of better or cheaper enzymes for commercial activities.
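To make the two reported metrics concrete, the following is a minimal sketch of how level-wise agreement between EC numbers and a Macro-F1 score could be computed. It is not the authors' implementation: the abstract does not define "hierarchical accuracy", so the sketch assumes it means that the predicted EC number matches the true one on every field up to the given level. The function names (`ec_prefix`, `level_accuracy`, `macro_f1`) and the example labels are hypothetical and are not taken from the paper's dataset.

```python
from collections import defaultdict


def ec_prefix(ec: str, level: int) -> str:
    """Return the first `level` fields of an EC number, e.g. ("3.4.21.4", 2) -> "3.4"."""
    return ".".join(ec.split(".")[:level])


def level_accuracy(y_true, y_pred, level):
    """Fraction of samples whose prediction matches the truth on all fields up to `level`.

    Assumption: this is one plausible reading of "hierarchical accuracy";
    the paper may define the metric differently.
    """
    hits = sum(ec_prefix(t, level) == ec_prefix(p, level)
               for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes seen in truth or prediction."""
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    classes = set(y_true) | set(y_pred)
    scores = []
    for c in classes:
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Illustrative labels only (not from the paper's dataset).
y_true = ["1.1.1.1", "1.1.1.2", "2.7.11.1", "3.4.21.4"]
y_pred = ["1.1.1.1", "1.1.1.1", "2.7.11.1", "3.4.21.5"]

for level in range(1, 5):
    print(f"level {level} accuracy: {level_accuracy(y_true, y_pred, level):.2f}")
print(f"macro-F1 at level 4: {macro_f1(y_true, y_pred):.3f}")
```

In this toy example every prediction is correct through the third level, and the errors appear only at the fourth. Macro-F1 weights every class equally, so rare fourth-level classes count as much as common ones, which is one reason it can sit well below accuracy on a dataset with many classes.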
Availability and implementation
The web-server can be freely accessed at http://hecnet.cbrlab.org/.
Supplementary information
Supplementary data are available at Bioinformatics online.
Funder
Higher Education Commission of Pakistan
Ministry of Planning Development and Reforms
National Center in Big Data and Cloud Computing (NCBC)
Publisher
Oxford University Press (OUP)
Subject
Computational Mathematics, Computational Theory and Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Statistics and Probability