Novel Protein Sequence Comparison Method Based on Transition
Probability Graph and Information Entropy
Published:2022-03
Issue:3
Volume:25
Page:392-400
ISSN:1386-2073
Container-title:Combinatorial Chemistry & High Throughput Screening
language:en
Short-container-title:CCHTS
Affiliation:
1. College of Information Science and Engineering, Hunan Normal University, Changsha 410081, China
Abstract
Aim and Objective:
Sequence analysis is one of the foundations of bioinformatics. It is widely used to uncover the
feature metrics hidden in a sequence. In addition, the graphical representation of biological sequences is an important tool for
sequence analysis. This study seeks a new graphical representation of biosequences.
Materials and Methods:
Transition probabilities are used to describe the amino acid combinations in protein sequences. The
combinations consist of amino acids that are either directly adjacent or separated by multiple amino acids. The transition
probability graph is built from the transition probabilities of these amino acid combinations. Next, a map from the transition
probability graph to a transition probability vector is defined through the k-order transition probability graph. Transition
entropy vectors are then derived from the transition probability vector and information entropy. Finally, the proposed method is
applied to two separate datasets: 499 HA genes of H1N1 and 95 coronaviruses.
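The abstract does not give the exact formulas, so the following is only a minimal sketch of how transition probabilities and transition entropies of this kind could be computed. It rests on several assumptions that are not taken from the paper: the transition probability is estimated as the conditional frequency of residue b appearing `gap + 1` positions after residue a, the entropy is the Shannon entropy (base 2) of each residue's outgoing transition distribution, and the final vector concatenates these entropies over gaps 0 to `max_gap`. The names `transition_probabilities`, `transition_entropy_vector`, `gap` and `max_gap` are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def transition_probabilities(seq, gap=0):
    """Estimate P(b | a) for residue pairs (a, b) separated by `gap` residues.

    Assumption: gap=0 means directly adjacent residues; gap=k skips k residues.
    """
    pairs = list(zip(seq, seq[gap + 1:]))
    pair_counts = Counter(pairs)
    source_counts = Counter(a for a, _ in pairs)
    return {(a, b): n / source_counts[a] for (a, b), n in pair_counts.items()}


def transition_entropy_vector(seq, max_gap=2):
    """Concatenate, for gap = 0..max_gap, the Shannon entropy of each
    residue's outgoing transition distribution (an assumed construction)."""
    vector = []
    for gap in range(max_gap + 1):
        probs = transition_probabilities(seq, gap)
        for a in AMINO_ACIDS:
            # Residues that never occur as a source contribute entropy 0.
            h = sum(-p * math.log2(p) for (src, _), p in probs.items() if src == a)
            vector.append(h)
    return vector


if __name__ == "__main__":
    vec = transition_entropy_vector("MKAILVVLLYTFATANA", max_gap=2)
    print(len(vec))  # 20 residues x 3 gaps
```

Under these assumptions every sequence, whatever its length, maps to a vector of fixed dimension `20 * (max_gap + 1)`, which is what allows alignment-free comparison of sequences of different lengths.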
Results:
By constructing a phylogenetic tree, we find that the results of each application are consistent with other studies.
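The abstract does not state which distance measure or tree-building algorithm was used. As an illustration only, the sketch below assumes Euclidean distance between per-sequence feature vectors (such as transition entropy vectors) and UPGMA (average-linkage) clustering, and returns the tree topology as nested tuples. The function names and the toy input are hypothetical and are not data from the study.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def upgma(names, vectors):
    """Build a tree topology (nested tuples) by UPGMA clustering.

    Assumption: UPGMA over Euclidean distances stands in for whatever
    tree-construction method the original study used.
    """
    # cluster id -> (subtree, number of leaves)
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    dist = {
        (i, j): euclidean(vectors[i], vectors[j])
        for i in clusters for j in clusters if i < j
    }
    next_id = len(names)
    while len(clusters) > 1:
        i, j = min(dist, key=dist.get)  # closest pair of clusters
        (tree_i, n_i), (tree_j, n_j) = clusters.pop(i), clusters.pop(j)
        # Size-weighted average distance from the merged cluster to the rest.
        for k in clusters:
            d_ik = dist.pop((min(i, k), max(i, k)))
            d_jk = dist.pop((min(j, k), max(j, k)))
            dist[(k, next_id)] = (n_i * d_ik + n_j * d_jk) / (n_i + n_j)
        del dist[(i, j)]
        clusters[next_id] = ((tree_i, tree_j), n_i + n_j)
        next_id += 1
    tree, _ = next(iter(clusters.values()))
    return tree


if __name__ == "__main__":
    # Toy vectors, not data from the study.
    print(upgma(["a", "b", "c"], [[0, 0], [0, 1], [5, 5]]))
```

In practice a dedicated phylogenetics package would usually be preferred for real datasets; the pure-Python version here is only meant to make the distance-to-tree step concrete.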
Conclusion:
The graphical representation proposed in this article is a practical and correct method.
Funder
Humanities and Social Sciences Research of Ministry of Education of China
Hunan Provincial Science and Technology Project Foundation
Publisher
Bentham Science Publishers Ltd.
Subject
Organic Chemistry,Computer Science Applications,Drug Discovery,General Medicine
Cited by
1 article.
1. Protein Sequence Comparison Method Based on 3-ary Huffman Coding;Match Communications in Mathematical and in Computer Chemistry;2023-04