Discovering regulatory motifs of genetic networks using the indexing-tree based algorithm: a parallel implementation

Authors:

Almomany Abedalmuhdi, Al-Omari Ahmad M., Jarrah Amin, Tawalbeh Mohammad

Abstract

Purpose: Motif discovery has become a significant challenge in the era of big data, where hundreds of genomes require annotation. The importance of motifs has led many researchers to develop tools and algorithms for finding them. This paper proposes a new algorithm that increases the speed and accuracy of motif discovery, the main weaknesses of existing motif discovery algorithms.

Design/methodology/approach: All motifs are stored in a tree-based indexing structure in which each motif is built from a combination of the nucleotides 'A', 'C', 'T' and 'G'. A full motif is discovered by extending the search around 4-mer nucleotides in both directions, left and right. The resulting motifs may be identical or degenerate, and may vary in length.

Findings: The developed implementation discovers conserved string motifs in DNA without prior information about the motifs. Even for a large data set containing millions of nucleotides and thousands of very long sequences, the entire process completes in a few seconds.

Originality/value: Experimental results demonstrate the efficiency of the proposed implementation. For a real data set of 1,270,000 nucleotides spread across 2,000 samples, the overall discovery process takes 5.9 s on an Intel Core i7-6700 @ 3.4 GHz machine and 26.7 s on an Intel Xeon x5670 @ 2.93 GHz machine. In addition, the authors improved computational performance by parallelizing the implementation to run on multi-core machines using the OpenMP framework. The speedup scales in proportion to the number of processors, with parallel efficiency close to 100%.
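The seed-and-extend idea described in the abstract can be illustrated with a short sketch. This is not the authors' algorithm: it assumes a plain dictionary in place of the indexing tree, handles only identical (not degenerate) motifs, and considers only seeds that occur exactly once in every sequence. The names `build_index`, `extend_seed` and `find_motifs` are hypothetical and chosen for illustration.

```python
from collections import defaultdict

K = 4  # seed length; the abstract extends the search around 4-mers


def build_index(sequences, k=K):
    """Map each k-mer to the (sequence id, offset) pairs where it occurs.

    Assumption: a dict stands in for the paper's tree-based index.
    """
    index = defaultdict(list)
    for sid, seq in enumerate(sequences):
        for pos in range(len(seq) - k + 1):
            index[seq[pos:pos + k]].append((sid, pos))
    return index


def _count_agreeing(sequences, hits, first_offset, step):
    """Count positions, moving by `step`, where all occurrences agree."""
    n = 0
    while True:
        chars = set()
        for sid, pos in hits:
            p = pos + first_offset + step * n
            in_range = 0 <= p < len(sequences[sid])
            chars.add(sequences[sid][p] if in_range else None)
        # Stop at a sequence boundary or at the first disagreement.
        if len(chars) != 1 or None in chars:
            return n
        n += 1


def extend_seed(sequences, hits, k=K):
    """Grow a seed left and right while every occurrence matches exactly."""
    left = _count_agreeing(sequences, hits, -1, -1)
    right = _count_agreeing(sequences, hits, k, +1)
    sid, pos = hits[0]
    return sequences[sid][pos - left:pos + k + right]


def find_motifs(sequences, k=K):
    """Return maximal exact motifs shared by all sequences.

    Simplification: a seed qualifies only if it occurs exactly once
    in each sequence.
    """
    expected = list(range(len(sequences)))
    motifs = set()
    for hits in build_index(sequences, k).values():
        if sorted(sid for sid, _ in hits) == expected:
            motifs.add(extend_seed(sequences, hits, k))
    return motifs


if __name__ == "__main__":
    seqs = ["CCGTATAATGC", "AATATAATCG", "GTTATAATAC"]
    print(find_motifs(seqs))
```

In this sketch each seed is extended independently of the others, so the loop in `find_motifs` could in principle be split across workers. The abstract reports that the authors parallelized their implementation with OpenMP; how their work is actually partitioned is not stated here.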

Publisher

Emerald

Subject

Computational Theory and Mathematics, Computer Science Applications, General Engineering, Software


Cited by 2 articles.
