Towards structured output prediction of enzyme function

Published:2008-12 Issue:S4 Volume:2 Page:
ISSN:1753-6561
Container-title:BMC Proceedings
language:en
Short-container-title:BMC Proc

Author:

Astikainen Katja, Holm Liisa, Pitkänen Esa, Szedmak Sandor, Rousu Juho

Abstract

Background: In this paper we describe work in progress on developing kernel methods for enzyme function prediction. Our focus is on developing so-called structured output prediction methods, where the enzymatic reaction is the combinatorial target object for prediction. We compared two structured output prediction methods, the Hierarchical Max-Margin Markov algorithm (HM3) and the Maximum Margin Regression algorithm (MMR), in hierarchical classification of enzyme function. As sequence features we use various string kernels and the GTG feature set derived from the global alignment trace graph of protein sequences.

Results: In our experiments, in predicting enzyme EC classification we obtain over 85% accuracy (predicting the four-digit EC code) and over 91% microlabel F1 score (predicting individual EC digits). In predicting the Gold Standard enzyme families, we obtain over 79% accuracy (predicting the family correctly) and over 89% microlabel F1 score (predicting superfamilies and families). In the latter case, structured output methods are significantly more accurate than a nearest neighbor classifier. A polynomial kernel over the GTG feature set turned out to be a prerequisite for accurate function prediction. Combining GTG with string kernels boosted accuracy slightly in the case of EC class prediction.

Conclusion: Structured output prediction with GTG features is shown to be computationally feasible and to have accuracy on par with state-of-the-art approaches in enzyme function prediction.
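The abstract reports two kinds of scores: exact-match accuracy on the full four-digit EC code and a microlabel F1 score over individual EC digits. The following Python sketch illustrates how the two can differ. It is not the authors' evaluation code. It assumes that each EC code is expanded into its hierarchy nodes (for example `1`, `1.1`, `1.1.1`, `1.1.1.1`) and that F1 is computed over these nodes pooled across all examples. The function names and the toy predictions are hypothetical.

```python
def ec_prefixes(ec_code):
    """Expand '1.2.3.4' into hierarchy nodes: '1', '1.2', '1.2.3', '1.2.3.4'."""
    digits = ec_code.split(".")
    return {".".join(digits[:i]) for i in range(1, len(digits) + 1)}


def exact_match_accuracy(true_codes, predicted_codes):
    """Fraction of examples whose full EC code is predicted correctly."""
    hits = sum(t == p for t, p in zip(true_codes, predicted_codes))
    return hits / len(true_codes)


def microlabel_f1(true_codes, predicted_codes):
    """F1 over hierarchy nodes pooled across all examples (assumed definition)."""
    tp = fp = fn = 0
    for t, p in zip(true_codes, predicted_codes):
        t_nodes, p_nodes = ec_prefixes(t), ec_prefixes(p)
        tp += len(t_nodes & p_nodes)   # nodes predicted and correct
        fp += len(p_nodes - t_nodes)   # nodes predicted but wrong
        fn += len(t_nodes - p_nodes)   # true nodes that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data for illustration only; these are not results from the paper.
true_codes = ["1.1.1.1", "2.7.1.1", "3.4.21.4"]
predicted_codes = ["1.1.1.1", "2.7.1.2", "3.5.1.1"]

accuracy = exact_match_accuracy(true_codes, predicted_codes)
f1 = microlabel_f1(true_codes, predicted_codes)
print(f"exact-match accuracy: {accuracy:.3f}")
print(f"microlabel F1:        {f1:.3f}")
```

In this toy case only one of the three full codes is correct, but the second prediction is wrong only in its last digit, so it still contributes three correct nodes to the microlabel score. This is one way the digit-level score can exceed the exact-match accuracy, as it does in the figures quoted above.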

Publisher

Springer Science and Business Media LLC

Subject

General Biochemistry, Genetics and Molecular Biology, General Medicine

Link

https://link.springer.com/content/pdf/10.1186/1753-6561-2-s4-s2.pdf

References: 31 articles.

1. Palsson B: Systems Biology: Properties of Reconstructed Networks. 2006, Cambridge University Press New York, NY, USA

2. Ashburner M, Ball C, Blake J, Botstein D, Butler H, Cherry J, Davis A, Dolinski K, Dwight S, Eppig J, et al: Gene Ontology: tool for the unification of biology. Nature Genetics. 2000, 25: 25-29.

3. Guldener U, Munsterkotter M, Kastenmuller G, Strack N, van Helden J, Lemer C, Richelles J, Wodak S, Garcia-Martinez J, Perez-Ortin J, et al: CYGD: the Comprehensive Yeast Genome Database. Nucleic Acids Research. 2005, 33 (Database issue): D364.

4. Lanckriet G, Deng M, Cristianini N, Jordan M, Noble W: Kernel-based data fusion and its application to protein function prediction in yeast. Proceedings of the Pacific Symposium on Biocomputing. 2004, 2004:

5. Borgwardt KM, Ong CS, Schönauer S, Vishwanathan SVN, Smola AJ, Kriegel HP: Protein function prediction via graph kernels. Bioinformatics. 2005, 21 (Suppl 1): i47-i56.

Cited by 18 articles.

1. Prediction of Human Phenotype Ontology terms by means of hierarchical ensemble methods;BMC Bioinformatics;2017-10-12

2. GOstruct 2.0;Proceedings of the 8th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics;2017-08-20

3. Machine Learning Approach to Predict Enzyme Subclasses;Multi-Scale Approaches in Drug Discovery;2017

4. Can computer vision problems benefit from structured hierarchical classification?;Machine Vision and Applications;2016-05-06

5. Scalable, accurate image annotation with joint SVMs and output kernels;Neurocomputing;2015-12