Affiliation:
1. School of Mathematics and Physics Science and Engineering, Hebei University of Engineering, Handan 056038, China
Abstract
Aims:
Using protein sequence information alone, a simple and effective method is proposed
to analyze protein sequence similarity and to predict DNA-binding proteins.
Background:
Given the rapidly growing volume of available molecular biology data, low-complexity
computational methods are needed to accurately infer protein structure, function, and
evolution.
Objective:
To develop a novel computational algorithm for analyzing and comparing protein
sequences that keeps pace with the rapidly growing volume of molecular biology data.
Method:
The numerical characteristics of a protein sequence are obtained by combining three
descriptors: a global position representation based on the curve of the Fermat spiral, a
local position representation based on the normalized moments of inertia of that curve,
and the composition of the 20 amino acids.
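The three descriptors can be illustrated with a minimal Python sketch. Only the amino acid composition is fully determined by the text above. The spiral equation r = a*sqrt(theta) is the standard Fermat spiral, but the angular spacing (`step`), the scale `a`, the choice to normalize the moment of inertia by the number of points, and the function names are all assumptions made for illustration and may differ from the mapping used in FermatS.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Fraction of each of the 20 standard amino acids in seq."""
    counts = {aa: 0 for aa in AMINO_ACIDS}
    for ch in seq.upper():
        if ch in counts:  # non-standard symbols are ignored
            counts[ch] += 1
    total = sum(counts.values())
    return [counts[aa] / total if total else 0.0 for aa in AMINO_ACIDS]


def fermat_points(n, a=1.0, step=math.pi / 10):
    """One point per sequence position on the Fermat spiral r = a*sqrt(theta).

    The spacing theta_i = i * step is an assumption, not taken from the paper.
    """
    points = []
    for i in range(1, n + 1):
        theta = i * step
        r = a * math.sqrt(theta)
        points.append((r * math.cos(theta), r * math.sin(theta)))
    return points


def normalized_moment_of_inertia(points):
    """Mean squared distance of the points to their centroid.

    Dividing by the point count is an assumed normalization.
    """
    if not points:
        return 0.0
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    return sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n


def local_descriptor(seq):
    """Hypothetical local feature: one moment per residue type, computed
    over the spiral points at the positions where that residue occurs."""
    seq = seq.upper()
    points = fermat_points(len(seq))
    return [
        normalized_moment_of_inertia(
            [p for ch, p in zip(seq, points) if ch == aa]
        )
        for aa in AMINO_ACIDS
    ]
```

Under these assumptions, concatenating `composition(seq)` and `local_descriptor(seq)` gives a fixed-length 40-dimensional vector for a sequence of any length, which is what makes sequences of different lengths directly comparable.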
Results:
The method was applied to analyze the similarity/dissimilarity of nine ND5 proteins, and
the results are consistent with biological evolution theory. Furthermore, logistic
regression with 5-fold cross-validation was used to build a prediction model for
DNA-binding proteins, which outperformed DNAbinder, iDNA-prot, DNA-prot and gDNA-prot by
0.0069-0.609 in F-measure and by 0.293-0.898 in MCC on the unbalanced dataset.
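For readers unfamiliar with the two reported metrics, the following Python sketch computes F-measure and MCC (Matthews correlation coefficient) from confusion-matrix counts using their standard definitions, and shows one simple way to form 5 folds. The function names and the interleaved fold assignment are illustrative assumptions; the paper's own splitting procedure is not described in this abstract.

```python
import math


def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall: 2TP / (2TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient, in [-1, 1]; 0.0 if undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def k_fold_indices(n, k=5):
    """Split indices 0..n-1 into k interleaved folds (assumed scheme)."""
    return [list(range(i, n, k)) for i in range(k)]


# Each fold serves once as the test set; the remaining folds form the training set.
folds = k_fold_indices(23, k=5)
splits = [
    (sorted(i for j, fold in enumerate(folds) if j != t for i in fold), folds[t])
    for t in range(len(folds))
]
```

MCC uses all four confusion-matrix cells, including true negatives, so it is less easily inflated by the majority class than F-measure, which is why both are commonly reported on unbalanced datasets.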
Conclusion:
These results show that our method, named FermatS, is effective for comparing,
recognizing, and predicting protein sequences.
Funder
Natural Science Foundation Project of Hebei
Department of Education in Hebei
National Natural Science Foundation of China
Publisher
Bentham Science Publishers Ltd.
Subject
Organic Chemistry, Computer Science Applications, Drug Discovery, General Medicine
Cited by
1 article.