Abstract
minimap2 is the gold-standard software for reference-based sequence mapping in third-generation long-read sequencing. While minimap2 is relatively fast, further speedup is desirable, especially when processing a multitude of large datasets. In this work, we present minimap2-fpga, a hardware-accelerated version of minimap2 that speeds up the mapping process by integrating an FPGA kernel optimised for chaining. We demonstrate speed-ups in end-to-end run-time for data from both Oxford Nanopore Technologies (ONT) and Pacific Biosciences (PacBio). minimap2-fpga is up to 79% and 53% faster than minimap2 for ∼30× ONT and ∼50× PacBio datasets respectively, when mapping without base-level alignment. When mapping with base-level alignment, minimap2-fpga is up to 62% and 10% faster than minimap2 for ∼30× ONT and ∼50× PacBio datasets respectively. The accuracy is near-identical to that of original minimap2 for both ONT and PacBio data, when mapping both with and without base-level alignment. minimap2-fpga is supported on Intel FPGA-based systems (evaluations performed on an on-premise system) and Xilinx FPGA-based systems (evaluations performed on a cloud system). We also provide a well-documented library for the FPGA-accelerated chaining kernel to be used by future researchers developing sequence alignment software with limited hardware background.
Publisher
Cold Spring Harbor Laboratory