Relevant and Non-Redundant Amino Acid Sequence Selection for Protein Functional Site Identification

Published:2010-04 Issue:2 Volume:2 Page:19-43
ISSN:1942-9045
Container-title:International Journal of Software Science and Computational Intelligence
language:en

Author:

Das Chandra¹, Maji Pradipta²

Affiliation:

1. West Bengal University of Technology, India

2. Indian Statistical Institute, India

Abstract

To apply a powerful pattern recognition algorithm to the prediction of functional sites in proteins, amino acids cannot be used directly as inputs, since they are non-numerical variables; they must be encoded first. The bio-basis function performs this encoding by mapping a non-numerical sequence space to a numerical feature space. An important issue for the bio-basis function is how to select a minimum set of bio-basis strings with maximum information. This paper describes an efficient method for selecting bio-basis strings that integrates the Fisher ratio with the concept of “degree of resemblance”. The integration enables the method to select a minimum set of the most informative bio-basis strings, while the “degree of resemblance” enables efficient selection of distinct bio-basis strings and thereby reduces redundant features in the numerical feature space. Quantitative indices are proposed for evaluating the quality of the selected bio-basis strings. The effectiveness of the proposed selection method, along with a comparison with existing methods, is demonstrated on different data sets.
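The encoding and selection idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the exponential form of `bio_basis` follows the commonly cited bio-basis function, `toy_score` is a hypothetical stand-in for a real substitution matrix (such as Dayhoff/PAM), `resemblance` is a simple ratio used here in place of the paper's "degree of resemblance", and the greedy procedure in `select_bases` is one plausible way to combine relevance (Fisher ratio) with a redundancy check.

```python
import math
from typing import List, Sequence


def toy_score(a: str, b: str) -> float:
    """Hypothetical similarity score; real work would use a substitution matrix."""
    return 2.0 if a == b else -1.0


def homology(x: str, y: str) -> float:
    """Sum of position-wise similarity scores for two equal-length strings."""
    if len(x) != len(y):
        raise ValueError("sequences must have equal length")
    return sum(toy_score(a, b) for a, b in zip(x, y))


def bio_basis(x: str, basis: str, gamma: float = 1.0) -> float:
    """Numerical feature for x relative to one bio-basis string (1.0 when x == basis)."""
    h_bb = homology(basis, basis)
    return math.exp(gamma * (homology(x, basis) - h_bb) / h_bb)


def fisher_ratio(pos: Sequence[float], neg: Sequence[float]) -> float:
    """Squared difference of class means divided by the sum of class variances."""
    m1, m2 = sum(pos) / len(pos), sum(neg) / len(neg)
    v1 = sum((p - m1) ** 2 for p in pos) / len(pos)
    v2 = sum((n - m2) ** 2 for n in neg) / len(neg)
    denom = v1 + v2
    if denom == 0:
        # Degenerate case: both classes have zero variance on this feature.
        return math.inf if m1 != m2 else 0.0
    return (m1 - m2) ** 2 / denom


def resemblance(a: str, b: str) -> float:
    """Assumed stand-in for 'degree of resemblance': homology of a to b, normalised by b."""
    return homology(a, b) / homology(b, b)


def select_bases(candidates: Sequence[str], pos: Sequence[str], neg: Sequence[str],
                 k: int, max_resemblance: float = 0.5) -> List[str]:
    """Greedy sketch: rank by Fisher ratio, skip candidates too similar to chosen ones."""
    def relevance(b: str) -> float:
        return fisher_ratio([bio_basis(x, b) for x in pos],
                            [bio_basis(x, b) for x in neg])

    chosen: List[str] = []
    for b in sorted(candidates, key=relevance, reverse=True):
        if all(resemblance(b, c) < max_resemblance for c in chosen):
            chosen.append(b)
        if len(chosen) == k:
            break
    return chosen


def encode(x: str, bases: Sequence[str]) -> List[float]:
    """Map one sequence to a numerical feature vector, one feature per bio-basis string."""
    return [bio_basis(x, b) for b in bases]
```

With the selected strings in hand, `encode(x, bases)` yields the numerical vector that a standard classifier could consume; the threshold `max_resemblance` is a hypothetical tuning parameter that controls how aggressively near-duplicate strings are discarded.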

Publisher

IGI Global

Subject

Pharmacology (medical)
