Vaeda computationally annotates doublets in single-cell RNA sequencing data

Published: 2022-11-07 Issue: 1 Volume: 39
ISSN: 1367-4811
Container-title: Bioinformatics
Language: en

Author:

Schriever Hannah¹², Kostka Dennis¹³

Affiliation:

1. Department of Developmental Biology, University of Pittsburgh, Pittsburgh, PA 15201, USA

2. Carnegie Mellon-University of Pittsburgh Joint PhD Program, University of Pittsburgh, Pittsburgh, PA 15201, USA

3. Department of Computational & Systems Biology and Center for Evolutionary Biology and Medicine, University of Pittsburgh, Pittsburgh, PA 15201, USA

Abstract

Motivation: Single-cell RNA sequencing (scRNA-seq) continues to expand our knowledge by facilitating the study of transcriptional heterogeneity at the level of single cells. Despite this technology’s utility and success in biomedical research, technical artifacts are present in scRNA-seq data. Doublets/multiplets are a type of artifact that occurs when two or more cells are tagged by the same barcode and therefore appear as a single cell. Because this introduces non-existent transcriptional profiles, doublets can bias and mislead downstream analysis. To address this limitation, computational methods to annotate and remove doublets from scRNA-seq datasets are needed.

Results: We introduce vaeda (Variational Auto-Encoder for Doublet Annotation), a new approach for computational annotation of doublets in scRNA-seq data. Vaeda integrates a variational auto-encoder and Positive-Unlabeled learning to produce doublet scores and binary doublet calls. We apply vaeda, along with seven existing doublet annotation methods, to 16 benchmark datasets and find that vaeda performs competitively in terms of doublet scores and doublet calls. Notably, vaeda outperforms other Python-based methods for doublet annotation. Altogether, vaeda is a robust and competitive method for scRNA-seq doublet annotation and may be of particular interest in the context of Python-based workflows.

Availability and implementation: Vaeda is available at https://github.com/kostkalab/vaeda, and the version used for the results we present here is archived at Zenodo (https://doi.org/10.5281/zenodo.7199783).

Supplementary information: Supplementary data are available at Bioinformatics online.
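The abstract's description of doublets (two cells sharing one barcode, yielding a mixed profile) and of doublet scores can be illustrated with a small toy sketch. This is not vaeda's implementation: a plain nearest-neighbour score stands in for the variational auto-encoder and the Positive-Unlabeled classifier, and the assumption that simulated doublets serve as the known positives (with observed cells treated as unlabeled) is ours. All function names, the toy count matrix, and the parameters `n_doublets` and `k` are hypothetical.

```python
import math
import random


def simulate_doublets(counts, n_doublets, rng):
    """Sum the count vectors of randomly paired cells to mimic doublets."""
    doublets = []
    for _ in range(n_doublets):
        a, b = rng.sample(range(len(counts)), 2)  # two distinct cells
        doublets.append([x + y for x, y in zip(counts[a], counts[b])])
    return doublets


def normalise(cell):
    """Scale a cell to a common library size, then log-transform."""
    total = sum(cell) or 1
    return [math.log1p(1e4 * x / total) for x in cell]


def knn_doublet_scores(counts, simulated, k=5):
    """Fraction of each observed cell's k nearest neighbours that are simulated.

    Stand-in for a learned classifier (assumption): simulated doublets are
    the labelled positives, observed cells are unlabeled.
    """
    observed = [normalise(c) for c in counts]
    pool = observed + [normalise(c) for c in simulated]
    n_obs = len(observed)
    scores = []
    for i, cell in enumerate(observed):
        # Distance to every other profile in the pool, skipping the cell itself.
        dists = [(math.dist(cell, other), j)
                 for j, other in enumerate(pool) if j != i]
        nearest = sorted(dists)[:k]
        # Indices >= n_obs belong to simulated doublets.
        scores.append(sum(j >= n_obs for _, j in nearest) / k)
    return scores


# Toy data: 30 cells x 8 genes of random counts (purely illustrative).
rng = random.Random(0)
counts = [[rng.randint(0, 20) for _ in range(8)] for _ in range(30)]
simulated = simulate_doublets(counts, n_doublets=30, rng=rng)
scores = knn_doublet_scores(counts, simulated, k=5)
```

Each score lies between 0 and 1; a binary doublet call would follow from thresholding the scores, with the threshold choice left open here.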

Funder

University of Pittsburgh School of Medicine

National Institutes of Health

NIH

National Institute of Biomedical Imaging and Bioengineering

NIBIB

Publisher

Oxford University Press (OUP)

Subject

Computational Mathematics, Computational Theory and Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Statistics and Probability

Link

https://academic.oup.com/bioinformatics/advance-article-pdf/doi/10.1093/bioinformatics/btac720/47104587/btac720.pdf

References: 20 articles.

1. scds: computational annotation of doublets in single-cell RNA sequencing data;Bais;Bioinformatics,2020

2. Solo: doublet identification in single-cell RNA-seq via semi-supervised deep learning;Bernstein;Cell Syst,2020

3. Doublet identification in single-cell sequencing data using scDblFinder;Germain;F1000Research,2021

4. mbkmeans: fast clustering for single cell data using mini-batch k-means;Hicks;PLoS Comput. Biol,2021

5. Multiplexed droplet single-cell RNA-sequencing using natural genetic variation;Kang;Nat. Biotechnol,2018

Cited by 2 articles.

1. A unified model-based framework for doublet or multiplet detection in single-cell multiomics data;Nature Communications;2024-07-02

2. Robust and Accurate Doublet Detection of Single-Cell Sequencing Data via Maximizing Area Under Precision-Recall Curve;2023-11-02