A graph-theoretical approach to DNA similarity analysis

Published: 2021-08-06

Author:

Nguyen Dong Quan Ngoc, Xing Lin, Le Phuong Dong Tan, Lin Lizhen

Abstract

DNA similarity analysis is a very active research area in bioinformatics. Existing approaches analyze similarities and dissimilarities between DNA sequences using either alignment-based or alignment-free methods. In this work, we introduce a novel representation of DNA sequences using n-ary Cartesian products of graphs, for arbitrary positive integers n. Each component graph in the Cartesian product representing a DNA sequence contains combinatorial information about certain tuples of nucleotides appearing in that sequence. We further introduce a metric space structure on the set of all Cartesian products of graphs that represent a given collection of DNA sequences. This makes it possible to compare different Cartesian products of graphs, which in turn indicates similarities and dissimilarities between the DNA sequences. We test the proposed method on several datasets, including Human Papillomavirus, Human rhinovirus, Influenza A virus, and Mammals. A comparison with other methods in the literature indicates that our results are comparable in terms of time complexity and accuracy, and on one dataset our method performs best among the methods compared.
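The abstract does not spell out how the component graphs are built or how the metric is defined, so the following is only a minimal sketch of the one ingredient that has a standard definition: the Cartesian product of two graphs, in which (u, v) and (u', v') are adjacent exactly when u = u' and v is adjacent to v', or v = v' and u is adjacent to u'. The helper `adjacency_graph`, which links nucleotides that occur consecutively in a sequence, is a hypothetical stand-in for a component graph and is not the construction used in the paper.

```python
from itertools import product


def adjacency_graph(seq):
    """Hypothetical component graph (an assumption, not the paper's method):
    nodes are the nucleotides in seq, with an undirected edge between two
    distinct nucleotides that occur consecutively."""
    nodes = set(seq)
    edges = {frozenset((a, b)) for a, b in zip(seq, seq[1:]) if a != b}
    return nodes, edges


def cartesian_product(g, h):
    """Standard Cartesian product of two undirected graphs."""
    (g_nodes, g_edges), (h_nodes, h_edges) = g, h
    nodes = set(product(g_nodes, h_nodes))
    edges = set()
    for (u, v), (u2, v2) in product(nodes, repeat=2):
        same_u_step_v = u == u2 and frozenset((v, v2)) in h_edges
        same_v_step_u = v == v2 and frozenset((u, u2)) in g_edges
        if same_u_step_v or same_v_step_u:
            edges.add(frozenset(((u, v), (u2, v2))))
    return nodes, edges


# Toy usage: a 3-node path times a single edge.
g = adjacency_graph("ACG")   # nodes {A, C, G}, edges A-C and C-G
h = adjacency_graph("AT")    # nodes {A, T}, edge A-T
p_nodes, p_edges = cartesian_product(g, h)
print(len(p_nodes), len(p_edges))  # 6 7
```

An n-ary product can be obtained by folding `cartesian_product` over a list of component graphs; the edge count in the toy case matches the general formula |V(G)|·|E(H)| + |V(H)|·|E(G)| = 3·1 + 2·2 = 7.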

Publisher

Cold Spring Harbor Laboratory

References (32 articles)

1. A novel DNA sequence similarity calculation based on simplified pulse-coupled neural network and Huffman coding; Physica A: Statistical Mechanics and its Applications, 2016

2. Analysis of similarities/dissimilarities of DNA sequences based on a novel graphical representation; MATCH Commun. Math. Comput. Chem., 2010

4. C-curve: a novel 3D graphical representation of DNA sequence based on codons; Mathematical Biosciences, 2013

5. Analysis of similarity/dissimilarity of DNA sequences based on a condensed curve representation; Journal of Molecular Structure: THEOCHEM, 2005