GenMap: ultra-fast computation of genome mappability

Author:

Pockrandt Christopher (1, 2, 3, 4), Alzamel Mai (5, 6), Iliopoulos Costas S. (5), Reinert Knut (3, 4)

Affiliation:

1. Center for Computational Biology, School of Medicine

2. Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA

3. Department of Computer Science and Mathematics, Freie Universität Berlin

4. Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany

5. Department of Informatics, King’s College London, London, UK

6. Department of Computer Science, King Saud University, Riyadh, Saudi Arabia

Abstract

Motivation: Computing the uniqueness of k-mers for each position of a genome while allowing for up to e mismatches is computationally challenging. However, it is crucial for many biological applications such as the design of guide RNA for CRISPR experiments. More formally, the uniqueness or (k, e)-mappability can be described for every position as the reciprocal value of how often this k-mer occurs approximately in the genome, i.e. with up to e mismatches.

Results: We present a fast method GenMap to compute the (k, e)-mappability. We extend the mappability algorithm, such that it can also be computed across multiple genomes where a k-mer occurrence is only counted once per genome. This allows for the computation of marker sequences or finding candidates for probe design by identifying approximate k-mers that are unique to a genome or that are present in all genomes. GenMap supports different formats such as binary output, wig and bed files as well as csv files to export the location of all approximate k-mers for each genomic position.

Availability and implementation: GenMap can be installed via bioconda. Binaries and C++ source code are available on https://github.com/cpockrandt/genmap.
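To make the definition of (k, e)-mappability concrete, the following is a minimal brute-force sketch in Python. It is not GenMap's algorithm or API: it compares every k-mer against every other k-mer in quadratic time and is meant only to illustrate the quantity being computed. The function names (`hamming`, `mappability`, `genomes_with_kmer`) are hypothetical, and the sketch assumes a single forward strand with mismatches measured by Hamming distance.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def mappability(genome: str, k: int, e: int) -> list[float]:
    """Brute-force (k, e)-mappability for every k-mer start position.

    The value at position i is 1 / (number of positions whose k-mer
    matches the k-mer at i with at most e mismatches). The k-mer always
    matches itself, so the count is at least 1.
    """
    kmers = [genome[i:i + k] for i in range(len(genome) - k + 1)]
    values = []
    for query in kmers:
        count = sum(1 for target in kmers if hamming(query, target) <= e)
        values.append(1.0 / count)
    return values


def genomes_with_kmer(kmer: str, genomes: list[str], e: int) -> int:
    """Count genomes containing the k-mer with at most e mismatches.

    Each genome contributes at most 1, however many occurrences it
    holds (an illustrative reading of "counted once per genome").
    """
    k = len(kmer)
    return sum(
        any(hamming(kmer, g[i:i + k]) <= e for i in range(len(g) - k + 1))
        for g in genomes
    )


if __name__ == "__main__":
    print(mappability("ACGTACGT", k=4, e=0))
    print(genomes_with_kmer("ACG", ["ACGT", "TTTT", "AAGA"], e=1))
```

A value of 1 marks a k-mer that is unique in the sequence under the chosen mismatch budget; smaller values indicate repeats. In the multi-genome sketch, a k-mer found in exactly one genome would be a candidate marker, while one found in all genomes would be a candidate for a shared probe.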

Funder

US National Institutes of Health

Royal Society (international exchange scheme)

Publisher

Oxford University Press (OUP)

Subject

Computational Mathematics, Computational Theory and Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Statistics and Probability

