Affiliation:
1. Unaffiliated, Corte Madera, CA, United States of America
Abstract
Mapping of reads to reference sequences is an essential step in a wide range of biological studies. The large size of datasets generated with next-generation sequencing technologies motivates the development of fast mapping software. Here, I describe URMAP, a new read mapping algorithm. URMAP is an order of magnitude faster than BWA with comparable accuracy on several validation tests. On a Genome in a Bottle (GIAB) variant calling test with 30× coverage 2×150 reads, URMAP achieves high accuracy (precision 0.998, sensitivity 0.982 and F-measure 0.990) with the strelka2 caller. However, GIAB reference variants are shown to be biased against repetitive regions which are difficult to map and may therefore pose an unrealistically easy challenge to read mappers and variant callers.
Subject
General Agricultural and Biological Sciences,General Biochemistry, Genetics and Molecular Biology,General Medicine,General Neuroscience
Reference33 articles.
1. Basic local alignment search tool;Altschul;Journal of Molecular Biology,1990
2. A map of human genome variation from population-scale sequencing;Altshuler;Nature,2010
3. GenBank;Benson;Nucleic Acids Research,2013
4. A block-sorting lossless data compression algorithm. Technical report 124, Palo Alto, CA, Digital Equipment Corporation;Burrows,1994
5. Modernizing reference genome assemblies;Church;PLOS Biology,2011
Cited by
13 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献