Abstract
Recent advances in long-read sequencing technologies have enabled accurate de novo assembly of large genomes and metagenomes. However, even long and accurate high-fidelity (HiFi) reads cannot resolve repeats that are longer than the reads themselves. This limitation significantly reduces the contiguity of diploid genome assemblies, since the two haplomes share many long identical regions. To generate telomere-to-telomere assemblies of diploid genomes, biologists use additional experimental technologies, such as linked reads or ultralong Oxford Nanopore reads. In particular, barcoded linked reads generated with the inexpensive TELL-Seq technology provide an attractive way to bridge unresolved repeats in phased assemblies of diploid genomes. Here, we present SpLitteR, a tool for haplotype phasing and scaffolding in an assembly graph using barcoded linked reads. We benchmark SpLitteR using TELL-Seq reads and assembly graphs produced by various long-read assemblers, and show that it improves upon state-of-the-art linked-read scaffolders in accuracy and contiguity metrics.
Publisher
Cold Spring Harbor Laboratory