Affiliation:
1. Periyar University, India
Abstract
Many people today are affected by genetic disorders and hereditary diseases. Protein complexes and their functions are studied in order to find irregularities in gene expression. Within a group of related proteins, there exist conserved sequence patterns (motifs) that are functionally or structurally similar. The main objective of this work is to extract motif information from a given protein sequence dataset, since the functions of proteins can ideally be determined from their motif information. Clustering is a core data mining technique, and biclustering is also used in many bioinformatics research works. A PSO K-Means clustering and biclustering approach is proposed in this work to extract the motif information. Motifs are extracted based on the structural homogeneity of the protein sequences. The resulting clusters and biclusters are compared on homogeneity and on the motif information extracted. The study shows that the biclustering approach yields better results than the clustering approach.