Ariadne: Barcoded Linked-Read Deconvolution Using de Bruijn Graphs
Author:
Mak Lauren, Meleshko Dmitry, Danko David C., Barakzai Waris N., Belchikov Natan, Hajirasouliha Iman
Abstract
De novo assemblies are critical for capturing the genetic composition of complex samples. Linked-read sequencing techniques such as 10x Genomics’ Linked-Reads, UST’s TELL-Seq, Loop Genomics’ LoopSeq, and BGI’s Long Fragment Read combine 3′ barcoding with standard short-read sequencing to expand the range of linkage resolution from hundreds to tens of thousands of base pairs. The application of linked-read sequencing to genome assembly has demonstrated that barcoding-based technologies balance the tradeoffs between long-range linkage, per-base coverage, and cost. Linked reads come with their own challenges, chief among them the association of multiple long fragments with the same 3′ barcode. The lack of a unique correspondence between a long fragment and a barcode, in conjunction with low sequencing depth, confounds the assignment of linkage between short reads.

Results: We introduce Ariadne, a novel linked-read deconvolution algorithm based on assembly graphs that can be used to extract single-species read sets from a large linked-read dataset. Ariadne deconvolution of linked-read clouds increases the proportion of read clouds containing only reads from a single fragment by up to 37.5-fold. Using these enhanced read clouds in de novo assembly significantly improves assembly contiguity and the size of the largest aligned blocks in comparison to the non-deconvolved read clouds. Integrating barcode deconvolution tools, such as Ariadne, into the postprocessing pipeline for linked-read technologies increases the quality of de novo assembly for complex populations, such as microbiomes. Ariadne is intuitive, computationally efficient, and scalable to other large-scale linked-read problems, such as human genome phasing.

Availability: The source code is available on GitHub: https://github.com/lauren-mak/Ariadne
Publisher
Cold Spring Harbor Laboratory