PROTEIN STRUCTURE AND FOLD PREDICTION USING TREE-AUGMENTED NAÏVE BAYESIAN CLASSIFIER

Published:2005-08 Issue:04 Volume:03 Page:803-819
ISSN:0219-7200
Container-title:Journal of Bioinformatics and Computational Biology
language:en
Short-container-title:J. Bioinform. Comput. Biol.

Author:

CHINNASAMY ARUNKUMAR¹, SUNG WING-KIN¹, MITTAL ANKUSH²

Affiliation:

1. Department of Computer Science, National University of Singapore, Singapore 117543, Singapore

2. Department of Electronics and Computer Engineering, Indian Institute of Technology, Roorkee, India

Abstract

Due to the large volume of protein sequence data, computational methods to determine the structure class and the fold class of a protein sequence have become essential. Several techniques based on sequence similarity, Neural Networks, Support Vector Machines (SVMs), etc. have been applied. Because most of these methods build a multi-class predictor from binary classifiers, up to C(N, 2) = N(N-1)/2 classifiers may be required for N classes. This paper presents a framework using Tree-Augmented Bayesian Networks (TAN), which performs multi-classification based on the theory of learning Bayesian Networks and uses an improved version of the feature vector representation of Ding et al. (2001). To enhance TAN's performance, the data are pre-processed by feature discretization and the outputs are post-processed with a Mean Probability Voting (MPV) scheme. The advantage of a Bayesian approach over other learning methods is that the network structure is intuitive. In addition, one can read off the TAN structure probabilities to determine the significance of each feature (say, hydrophobicity) for each class, which helps to further understand the complexity of protein structure. Experiments on the datasets used in three prominent recent works show that our approach is more accurate than other discriminative methods. The framework is implemented on the BAYESPROT web server, where more detailed results are also available.
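As a rough illustration of two ideas named in the abstract, the sketch below counts the pairwise binary classifiers a one-vs-one scheme needs and shows one plausible reading of mean probability voting: averaging per-class probabilities from several models and taking the top class. The function names, the input layout, and the averaging rule are assumptions made for illustration; the abstract does not specify how MPV is defined, so this is not the authors' implementation.

```python
from math import comb


def pairwise_classifier_count(n_classes: int) -> int:
    """Number of one-vs-one binary classifiers for n_classes classes: C(N, 2)."""
    return comb(n_classes, 2)


def mean_probability_vote(prob_tables):
    """Average per-class probabilities across models and pick the top class.

    prob_tables: list of dicts mapping class label -> probability, one dict
    per model. This layout is hypothetical and chosen for illustration only.
    """
    classes = prob_tables[0].keys()
    n_models = len(prob_tables)
    # Mean probability of each class over all models.
    mean = {c: sum(t[c] for t in prob_tables) / n_models for c in classes}
    # The predicted class is the one with the highest mean probability.
    best = max(mean, key=mean.get)
    return best, mean


if __name__ == "__main__":
    print(pairwise_classifier_count(4))
    print(mean_probability_vote([{"a": 0.6, "b": 0.4}, {"a": 0.2, "b": 0.8}]))
```

A single multi-class model such as TAN avoids the quadratic growth in classifier count that the first function makes explicit.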

Publisher

World Scientific Pub Co Pte Ltd

Subject

Computer Science Applications, Molecular Biology, Biochemistry

Link

https://www.worldscientific.com/doi/pdf/10.1142/S0219720005001302

References: 18 articles.

1. Addressing the problems of Bayesian network classification of video using high-dimensional features

2. Recognition of a protein fold in the context of the SCOP classification

3. Prediction of protein folding class using global description of amino acid sequence.

Cited by 39 articles.

1. An Enhanced Protein Fold Recognition for Low Similarity Datasets Using Convolutional and Skip-Gram Features With Deep Neural Network;IEEE Transactions on NanoBioscience;2021-01

2. Prediction of protein structural classes by different feature expressions based on 2-D wavelet denoising and fusion;BMC Bioinformatics;2019-12

3. Predicting subcellular localization of multi-label proteins by incorporating the sequence features into Chou's PseAAC;Genomics;2019-12

4. Binary classification of imbalanced datasets: The case of CoIL challenge 2000;Expert Systems with Applications;2019-08

5. Predicting protein structural classes for low-similarity sequences by evaluating different features;Knowledge-Based Systems;2019-01