Affiliation:
1. Stanford University, Stanford, CA, USA
2. Stanford University, NVIDIA Research, Stanford, CA, USA
Abstract
Genomics is transforming medicine and our understanding of life in fundamental ways. Genomics data, however, is far outpacing Moore's Law. Third-generation sequencing technologies produce 100X longer reads than second-generation technologies and reveal a much broader mutation spectrum of disease and evolution. However, these technologies incur prohibitively high computational costs. Over 1,300 CPU hours are required for reference-guided assembly of the human genome, and over 15,600 CPU hours are required for de novo assembly. This paper describes "Darwin", a co-processor for genomic sequence alignment that, without sacrificing sensitivity, provides up to 15,000X speedup over the state-of-the-art software for reference-guided assembly of third-generation reads. Darwin achieves this speedup through hardware/algorithm co-design, trading more easily accelerated alignment for less memory-intensive filtering, and by optimizing the memory system for filtering. Darwin combines a hardware-accelerated version of D-SOFT, a novel filtering algorithm, with a hardware-accelerated version of GACT, a novel alignment algorithm. GACT generates near-optimal alignments of arbitrarily long genomic sequences using constant memory for the compute-intensive step. Darwin is adaptable, with tunable speed and sensitivity to match emerging sequencing technologies and to meet the requirements of genomic applications beyond read assembly.
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Graphics and Computer-Aided Design, Software
Reference
83 articles.
1. Pico computing product brief: M-505-k325t. URL https://goo.gl/poeWUA.
2. TimeLogic Corporation. URL http://www.timelogic.com.
3. Ultraconserved Elements in the Human Genome
Cited by
35 articles.
1. GenArchBench: A genomics benchmark suite for arm HPC processors;Future Generation Computer Systems;2024-08
2. MegIS: High-Performance, Energy-Efficient, and Low-Cost Metagenomic Analysis with In-Storage Processing;2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA);2024-06-29
3. QUETZAL: Vector Acceleration Framework for Modern Genome Sequence Analysis Algorithms;2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA);2024-06-29
4. Harp: Leveraging Quasi-Sequential Characteristics to Accelerate Sequence-to-Graph Mapping of Long Reads;Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3;2024-04-27
5. Data Motion Acceleration: Chaining Cross-Domain Multi Accelerators;2024 IEEE International Symposium on High-Performance Computer Architecture (HPCA);2024-03-02