Abstract
Motivation
Pangenomics is a growing field within computational genomics. Many pangenomic analyses use bidirected sequence graphs as their core data model. However, implementing and correctly using this data model can be difficult, and the scale of pangenomic data sets can be challenging to work with. These challenges have impeded progress in this field.
Results
Here we present a stack of two C++ libraries, libbdsg and libhandlegraph, which use a simple, field-proven interface designed to expose elementary features of these graphs while preventing common graph manipulation mistakes. The libraries also provide a Python binding. Using a diverse collection of pangenome graphs, we demonstrate that these tools allow for efficient construction and manipulation of large genome graphs with dense variation. For instance, speed and memory usage are up to an order of magnitude better than those of the prior graph implementation in the vg toolkit, which has now transitioned to using libbdsg's implementations.
Availability
libhandlegraph and libbdsg are available under an MIT License from https://github.com/vgteam/libhandlegraph and https://github.com/vgteam/libbdsg.
Contact
erik.garrison@ucsc.edu
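To make the data model concrete, the following is a minimal, self-contained Python sketch of a bidirected sequence graph in which every traversal goes through an oriented "handle". It is a toy written for illustration only: the class `ToyBidirectedGraph`, the `Handle` dataclass, and all method names are assumptions of this sketch, not the libbdsg or libhandlegraph API, and it makes no attempt at the memory efficiency the abstract describes.

```python
from dataclasses import dataclass

# Complement table used to reverse-complement DNA strings.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


@dataclass(frozen=True, order=True)
class Handle:
    """Hypothetical handle: a node ID plus the strand it is read on."""
    node_id: int
    is_reverse: bool = False

    def flip(self):
        # Same node, opposite strand.
        return Handle(self.node_id, not self.is_reverse)


class ToyBidirectedGraph:
    """Toy bidirected sequence graph (illustrative, not libbdsg)."""

    def __init__(self):
        self._sequences = {}  # node_id -> forward-strand sequence
        self._edges = set()   # set of (left Handle, right Handle) pairs

    def create_handle(self, sequence):
        node_id = len(self._sequences) + 1
        self._sequences[node_id] = sequence
        return Handle(node_id)

    def get_sequence(self, handle):
        # A reverse handle reads the reverse complement of the node.
        seq = self._sequences[handle.node_id]
        if handle.is_reverse:
            return seq.translate(_COMPLEMENT)[::-1]
        return seq

    def create_edge(self, left, right):
        # The edge left -> right is the same edge as flip(right) -> flip(left).
        # Storing both views keeps traversal symmetric across strands.
        self._edges.add((left, right))
        self._edges.add((right.flip(), left.flip()))

    def follow_edges(self, handle, go_left=False):
        # Return the handles reachable from one side of `handle`.
        if go_left:
            return sorted(a for a, b in self._edges if b == handle)
        return sorted(b for a, b in self._edges if a == handle)


graph = ToyBidirectedGraph()
a = graph.create_handle("GATT")
b = graph.create_handle("ACA")
graph.create_edge(a, b.flip())  # leaving a forward enters b on its reverse strand

# Spell the sequence of the walk: a forward, then b reverse.
walk = "".join(graph.get_sequence(h) for h in (a, b.flip()))
print(walk)  # GATTTGT
```

Because an orientation is carried by the handle rather than tracked separately by the caller, reading a node on the wrong strand or following an edge from the wrong side becomes harder to do by accident, which is the kind of mistake the abstract says the interface is designed to prevent.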
Publisher
Cold Spring Harbor Laboratory
Cited by
4 articles.