ADEPT: a domain independent sequence alignment strategy for gpu architectures

Author:

Awan Muaaz G., Deslippe Jack, Buluc Aydin, Selvitopi Oguz, Hofmeyr Steven, Oliker Leonid, Yelick Katherine

Abstract

Background: Bioinformatic workflows frequently make use of automated genome assembly and protein clustering tools. At the core of most of these tools, a significant portion of execution time is spent determining the optimal local alignment between two sequences. This task is performed with the Smith-Waterman algorithm, a dynamic programming based method. With the advent of modern sequencing technologies and the increasing size of both genome and protein databases, a need for faster Smith-Waterman implementations has emerged. Multiple SIMD strategies for the Smith-Waterman algorithm are available for CPUs. However, with the move of HPC facilities towards accelerator based architectures, a need for an efficient GPU accelerated strategy has emerged. Existing GPU based strategies have been optimized either for a specific type of characters (nucleotides or amino acids) or for only a handful of application use-cases.

Results: In this paper, we present ADEPT, a new sequence alignment strategy for GPU architectures that is domain independent, supporting alignment of sequences from both genomes and proteins. Our proposed strategy uses GPU specific optimizations that do not rely on the nature of the sequence. We demonstrate the feasibility of this strategy by implementing the Smith-Waterman algorithm and comparing it to similar CPU strategies as well as the fastest known GPU methods for each domain. ADEPT’s driver enables it to scale across multiple GPUs and allows easy integration into software pipelines that utilize large scale computational systems. We show that the ADEPT based Smith-Waterman algorithm reaches a peak performance of 360 GCUPS and 497 GCUPS for protein based and DNA based datasets respectively on a single GPU node (8 GPUs) of the Cori Supercomputer. Overall, ADEPT shows 10x faster performance in a node-to-node comparison against a corresponding SIMD CPU implementation.

Conclusions: ADEPT demonstrates performance that is comparable to or better than existing GPU strategies. We demonstrated the efficacy of ADEPT in supporting existing bioinformatics software pipelines by integrating ADEPT into MetaHipMer, a high-performance de novo metagenome assembler, and PASTIS, a high-performance protein similarity graph construction pipeline. Our results show performance improvements of 10% and 30% in MetaHipMer and PASTIS respectively.
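For readers unfamiliar with the algorithm named in the abstract, the following is a minimal sequential sketch of Smith-Waterman local alignment scoring. It is not ADEPT's GPU implementation. The function names, the linear gap penalty, and the match/mismatch values are assumptions made for illustration. The substitution score is passed in as a function, so the same routine could score nucleotides or amino acids (for proteins, a lookup into a substitution matrix such as BLOSUM62 would take the place of `dna_score`). The `gcups` helper computes giga cell updates per second for a single pair of sequences, the unit in which the abstract reports throughput.

```python
def smith_waterman_score(a, b, score, gap=-2):
    """Return the best local alignment score between sequences a and b.

    score(x, y) is any substitution function, so the routine does not
    depend on the alphabet. A linear gap penalty is assumed here.
    """
    prev = [0] * (len(b) + 1)  # previous row of the DP matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            curr[j] = max(
                0,                                        # start a new local alignment
                prev[j - 1] + score(a[i - 1], b[j - 1]),  # match or mismatch
                prev[j] + gap,                            # gap in b
                curr[j - 1] + gap,                        # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


def dna_score(x, y, match=2, mismatch=-1):
    """Hypothetical nucleotide scoring: fixed match and mismatch values."""
    return match if x == y else mismatch


def gcups(query_len, ref_len, seconds):
    """Giga cell updates per second for one query/reference pair."""
    return query_len * ref_len / seconds / 1e9


if __name__ == "__main__":
    print(smith_waterman_score("ACGT", "ACGT", dna_score))
    print(gcups(100_000, 10_000, 1.0))
```

Each cell depends only on its left, upper, and upper-left neighbours, so all cells on the same anti-diagonal can be computed independently. Parallel implementations of this recurrence typically exploit that property.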

Publisher

Springer Science and Business Media LLC

Subject

Applied Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Structural Biology


