Abstract
Droplet-based single-cell RNA sequencing (scRNA-seq) has significantly increased the number of cells profiled per experiment and revolutionized the study of individual transcriptomes. However, to maximize the biological signal, robust computational methods are needed to distinguish cell-free from cell-containing droplets. Here, we introduce a novel cell-calling algorithm called EmptyNN, which trains a neural network based on positive-unlabeled learning for improved filtering of barcodes. We leveraged cell hashing and genetic variation to provide ground truth. EmptyNN accurately removed cell-free droplets while recovering lost cell clusters, and achieved an Area Under the Receiver Operating Characteristic curve (AUROC) of 94.73% and 96.30%, respectively. Comparisons with current state-of-the-art cell-calling algorithms demonstrated the superior performance of EmptyNN, as measured by the number of recovered cell-containing droplets and cell types. EmptyNN was further applied to two additional datasets and showed good performance. Therefore, EmptyNN represents a powerful tool to enhance scRNA-seq quality control analyses.
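To make the reported metric concrete: AUROC is the probability that a randomly chosen cell-containing droplet receives a higher score than a randomly chosen cell-free droplet, judged against ground-truth labels. The sketch below is illustrative only and is not the EmptyNN implementation; the function name `auroc`, the label encoding (1 = cell-containing, 0 = cell-free), and the toy scores are all assumptions made for this example.

```python
def auroc(labels, scores):
    """Pairwise AUROC: fraction of (cell, empty) pairs ranked correctly.

    labels: 1 for cell-containing droplets, 0 for cell-free (assumed encoding).
    scores: classifier scores, higher meaning "more likely to contain a cell".
    Tied pairs count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one droplet of each class")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: four barcodes with made-up ground truth and scores.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(labels, scores))  # 0.75
```

The pairwise form is quadratic in the number of barcodes, so it suits small checks like this one; a rank-based computation would be the usual choice for the barcode counts in a real experiment.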
Publisher
Cold Spring Harbor Laboratory
Cited by
1 article.