Abstract
Improvements in nanopore sequencing necessitate efficient classification methods, including pre-filtering and adaptive sampling algorithms that enrich for reads of interest. Signal-based approaches circumvent the computational bottleneck of basecalling. But past methods for signal-based classification do not scale efficiently to large, repetitive references like pangenomes, limiting their utility to partial references or individual genomes. We introduce Sigmoni: a rapid, multiclass classification method based on the r-index that scales to references of hundreds of Gbps. Sigmoni quantizes nanopore signal into a discrete alphabet of picoamp ranges. It performs rapid, approximate matching using matching statistics, classifying reads based on distributions of picoamp matching statistics and co-linearity statistics. Sigmoni is 10-100× faster than previous methods for adaptive sampling in host depletion experiments with improved accuracy, and can query reads against large microbial or human pangenomes.
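As a rough illustration of the two ideas named in the abstract (quantizing signal into picoamp ranges, then computing matching statistics against a reference), the sketch below may help. It is not Sigmoni's implementation: the function names, the equal-width binning, the 60-120 pA range, and the six-symbol alphabet are all assumptions chosen for illustration, and the matching statistics are computed by naive substring search rather than with an r-index.

```python
import numpy as np

def quantize(signal_pa, n_bins=6, lo=60.0, hi=120.0):
    """Map picoamp values to symbols 0..n_bins-1 using equal-width bins.

    The bin count and the [lo, hi] range are illustrative assumptions,
    not values taken from the paper.
    """
    # Interior bin edges only, so np.digitize returns 0..n_bins-1.
    edges = np.linspace(lo, hi, n_bins + 1)[1:-1]
    return np.digitize(np.clip(signal_pa, lo, hi), edges).tolist()

def matching_statistics(query, reference):
    """Naive matching statistics over symbol sequences.

    ms[i] is the length of the longest substring of `query` starting at
    position i that occurs somewhere in `reference`. This brute-force
    version stands in for the compressed-index query; it only shows
    what the statistic measures.
    """
    q = "".join(chr(97 + s) for s in query)      # symbols -> letters
    r = "".join(chr(97 + s) for s in reference)
    ms = []
    for i in range(len(q)):
        length = 0
        while i + length < len(q) and q[i:i + length + 1] in r:
            length += 1
        ms.append(length)
    return ms

# Toy example with made-up symbol sequences.
reference = [0, 1, 2, 3, 4]
read = [1, 2, 3, 0, 0]
print(matching_statistics(read, reference))
```

Long matching statistics suggest that a read shares long stretches of quantized signal with the reference. How Sigmoni turns the distribution of these values, together with co-linearity statistics, into a class decision is described in the paper and is not reproduced here.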
Publisher
Cold Spring Harbor Laboratory