An efficient method for significant motifs discovery from multiple DNA sequences

Published:2017-08 Issue:04 Volume:15 Page:1750014
ISSN:0219-7200
Container-title:Journal of Bioinformatics and Computational Biology
language:en
Short-container-title:J. Bioinform. Comput. Biol.

Author:

Al-Ssulami Abdulrakeeb M.¹, Azmi Aqil M.¹, Mathkour Hassan¹

Affiliation:

1. Department of Computer Science, College of Computer & Information Sciences, King Saud University, Riyadh 11543, Saudi Arabia

Abstract

Identification of transcription factor binding sites, or biological motifs, is an important step in deciphering the mechanisms of gene regulation. It is a classic problem that has eluded a satisfactory and efficient solution. In this paper, we devise a three-phase algorithm to mine for biologically significant motifs. In the first phase, we generate all possible string motifs. In the second phase, a filtering process discards all motifs that do not meet the constraints. In the final phase, the remaining motifs are scored and ranked using a combination of stochastic techniques and the p-value. We show that our method outperforms some very well-known motif discovery tools, e.g. MEME and Weeder, on well-established benchmark data suites. We also apply the algorithm to the non-coding regions of M. tuberculosis and report significant motifs of size 10 with excellent p-values in a fraction of the time taken by MEME and MoSDi. In fact, among the 10 best motifs (ranked by p-value) in the non-coding regions of M. tuberculosis reported by MEME, MoSDi and our tool, five were discovered by our approach, including the third and fourth best. Our method achieved this in 1/17 and 1/6 of the time taken by MEME and MoSDi, respectively.
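The three phases named in the abstract (generate, filter, score and rank) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not specify the constraints or the scoring scheme, so the sketch assumes exact k-mer motifs, a hypothetical `quorum` constraint (the minimum number of sequences that must contain a motif), and a simple binomial tail p-value under a uniform, independent background model. All function names are illustrative.

```python
from collections import defaultdict
from math import comb


def candidate_motifs(sequences, k):
    """Phase 1: map every length-k substring to the set of sequences containing it."""
    support = defaultdict(set)
    for idx, seq in enumerate(sequences):
        for i in range(len(seq) - k + 1):
            support[seq[i:i + k]].add(idx)
    return support


def filter_motifs(support, quorum):
    """Phase 2: keep motifs found in at least `quorum` sequences (assumed constraint)."""
    return {motif: seqs for motif, seqs in support.items() if len(seqs) >= quorum}


def binomial_tail(n, x, p):
    """Return P(X >= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(x, n + 1))


def rank_motifs(sequences, k, quorum):
    """Phase 3: rank surviving motifs by an illustrative binomial p-value."""
    n = len(sequences)
    kept = filter_motifs(candidate_motifs(sequences, k), quorum)
    # Assumption: uniform base frequencies and independent positions, so a fixed
    # k-mer matches one position with probability 0.25**k. The mean number of
    # start positions per sequence is used so that one p applies to all sequences.
    mean_positions = sum(max(len(s) - k + 1, 0) for s in sequences) / n
    p_occurs = 1 - (1 - 0.25**k) ** mean_positions
    scored = [(motif, binomial_tail(n, len(seqs), p_occurs)) for motif, seqs in kept.items()]
    # Smaller p-value means more surprising; break ties alphabetically.
    return sorted(scored, key=lambda item: (item[1], item[0]))


if __name__ == "__main__":
    toy = ["ACGTACGT", "TTACGTAA", "GGACGTCC", "ACACACAC"]
    for motif, p_value in rank_motifs(toy, k=4, quorum=3):
        print(motif, p_value)
```

On the toy input, only `ACGT` occurs in three of the four sequences, so it is the only motif that passes the assumed quorum filter. Exhaustive enumeration of exact k-mers is chosen here for clarity; practical motif finders typically also allow mismatches, which this sketch does not attempt.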

Publisher

World Scientific Pub Co Pte Lt

Subject

Computer Science Applications,Molecular Biology,Biochemistry

Link

https://www.worldscientific.com/doi/pdf/10.1142/S0219720017500147

References: 28 articles.

1. Computational identification of transcriptional regulatory elements in DNA sequence

2. Error Detecting and Error Correcting Codes

3. Shuffling biological sequences with motif constraints

Cited by 3 articles.

1. Discovery of network motifs based on induced subgraphs using a dynamic expansion tree;Computational Biology and Chemistry;2021-08

2. Genome-wide identification of 5-methylcytosine sites in bacterial genomes by high-throughput sequencing of MspJI restriction fragments;PLOS ONE;2021-05-11

3. Genome-Wide Identification of 5-Methylcytosine Sites in Bacterial Genomes By High-Throughput Sequencing of MspJI Restriction Fragments;2021-02-10