Abstract
The advent of long-read single-cell transcriptome sequencing (lr-scRNA-Seq) represents a significant leap forward in single-cell genomics. With the recent introduction of R10 flowcells by Oxford Nanopore, we propose that previous computational methods designed to handle high sequencing error rates are no longer relevant, and that the prevailing approach of using short reads to compile a “barcode space” (candidate barcode list) to de-multiplex long reads is no longer necessary. Instead, computational methods should now shift their focus to harnessing the unique benefits of long reads to analyze transcriptome complexity. In this context, we introduce a comprehensive suite of computational methods named Single-Cell Omics for Transcriptome CHaracterization (SCOTCH). Our method is compatible with the single-cell library preparation platforms from both 10X Genomics and Parse Biosciences, facilitating the analysis of special cell populations, such as neurons, hepatocytes and developing cardiomyocytes. We specifically re-formulated the transcript mapping problem with a compatibility matrix and addressed the multiple-mapping issue using probabilistic inference, which allows the discovery of novel isoforms as well as the detection of differential isoform usage between cell populations. We evaluated SCOTCH through analysis of real data across different combinations of single-cell libraries and sequencing technologies (10X + Illumina, Parse + Illumina, 10X + Nanopore_R9, 10X + Nanopore_R10, Parse + Nanopore_R10), and showed its ability to yield novel biological insights on cell type-specific isoform expression. These datasets add to the pool of publicly available data for continued development of computational approaches. In summary, SCOTCH allows extraction of more biological insights from the new advancements in single-cell library construction and sequencing technologies, facilitating the examination of transcriptome complexity at the single-cell level.
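To illustrate the idea of pairing a compatibility matrix with probabilistic inference for multi-mapping reads, the following is a minimal sketch using a generic expectation-maximization (EM) scheme. It is not SCOTCH's implementation: the function name `em_isoform_abundance`, the binary read-by-isoform matrix layout, and the EM update rule are all assumptions chosen for illustration, and the abstract does not specify which inference procedure SCOTCH uses.

```python
import numpy as np


def em_isoform_abundance(compat, n_iter=200, tol=1e-8):
    """Estimate isoform proportions from a read-by-isoform compatibility matrix.

    Hypothetical helper for illustration only (not part of SCOTCH).
    compat[i, k] is 1 if read i is compatible with isoform k, else 0.
    Returns (theta, resp): isoform proportions and per-read assignment
    probabilities.
    """
    compat = np.asarray(compat, dtype=float)
    if np.any(compat.sum(axis=1) == 0):
        raise ValueError("every read must be compatible with at least one isoform")

    n_reads, n_iso = compat.shape
    theta = np.full(n_iso, 1.0 / n_iso)  # start from uniform proportions

    for _ in range(n_iter):
        # E-step: split each read across its compatible isoforms,
        # weighted by the current proportion estimates.
        weighted = compat * theta
        resp = weighted / weighted.sum(axis=1, keepdims=True)

        # M-step: new proportions are the average fractional assignments.
        new_theta = resp.sum(axis=0) / n_reads
        converged = np.abs(new_theta - theta).max() < tol
        theta = new_theta
        if converged:
            break

    # Recompute assignments so they match the final proportions.
    weighted = compat * theta
    resp = weighted / weighted.sum(axis=1, keepdims=True)
    return theta, resp


# Toy example: two reads unique to isoform 0, one ambiguous read,
# and one read unique to isoform 1.
toy_compat = [
    [1, 0],
    [1, 0],
    [1, 1],
    [0, 1],
]
theta, resp = em_isoform_abundance(toy_compat)
```

In this toy case the ambiguous third read is assigned mostly to isoform 0, because the uniquely mapped reads already favor that isoform; the estimated proportions approach 2/3 and 1/3.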
Publisher
Cold Spring Harbor Laboratory