Evaluation of methods to assign cell type labels to cell clusters from single-cell RNA-sequencing data

Published:2019-03-15 Volume:8 Page:296
ISSN:2046-1402
Container-title:F1000Research
language:en
Short-container-title:F1000Res

Author:

Diaz-Mejia J. Javier, Meng Elaine C., Pico Alexander R., MacParland Sonya A., Ketela Troy, Pugh Trevor J., Bader Gary D., Morris John H.

Abstract

Background: Identifying cell type subpopulations in complex cell mixtures from single-cell RNA-sequencing (scRNA-seq) data involves automated computational steps such as data normalization, dimensionality reduction and cell clustering. However, most researchers still assign cell type labels to cell clusters manually, which results in limited documentation, low reproducibility and uncontrolled vocabularies. Two bottlenecks to automating this task are the scarcity of reference cell type gene expression signatures and the fact that some dedicated methods are available only as web servers with a limited set of cell type gene expression signatures.

Methods: In this study, we benchmarked four methods (CIBERSORT, GSEA, GSVA, and ORA) for the task of assigning cell type labels to cell clusters from scRNA-seq data. We used scRNA-seq datasets from liver, peripheral blood mononuclear cells and retinal neurons for which reference cell type gene expression signatures were available.

Results: All four methods generally performed well in receiver operating characteristic curve analysis (average area under the curve (AUC) = 0.94, sd = 0.036), whereas precision-recall curve analysis showed wide variation depending on the method and dataset (average AUC = 0.53, sd = 0.24).

Conclusions: CIBERSORT and GSVA were the top two performers. Additionally, GSVA was the fastest of the four methods and was more robust in cell type gene expression signature subsampling simulations. We provide an extensible framework to evaluate other methods and datasets at https://github.com/jdime/scRNAseq_cell_cluster_labeling.
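The gap between the ROC and precision-recall results reported in the abstract is typical when true labels are rare among the candidates. The sketch below is not taken from the authors' framework; it is a minimal, standard-library illustration on made-up scores, assuming each candidate is a (cluster, cell type) pair scored by some method and that only a few pairs are correct. Average precision is used here as one common estimate of the area under the precision-recall curve; the paper may compute it differently. The names `roc_auc`, `average_precision`, `labels` and `scores` are hypothetical.

```python
def roc_auc(labels, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values at the rank of each true positive."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    n_pos = sum(labels)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos


# Made-up example: 20 candidate (cluster, cell type) pairs, only 2 correct.
pos_scores = [0.70, 0.60]
neg_scores = [0.90, 0.80, 0.65] + [0.45 - 0.02 * i for i in range(15)]
scores = pos_scores + neg_scores
labels = [1] * len(pos_scores) + [0] * len(neg_scores)

print(f"ROC AUC:           {roc_auc(labels, scores):.3f}")            # 31/36, about 0.861
print(f"Average precision: {average_precision(labels, scores):.3f}")  # 11/30, about 0.367
```

In this toy case, three high-scoring wrong labels barely affect the ROC AUC because the 15 low-scoring negatives dominate the comparison, but they cut precision sharply at the top of the ranking.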

Funder

Chan Zuckerberg Initiative

National Resource for Network Biology

Publisher

F1000 Research Ltd

Subject

General Pharmacology, Toxicology and Pharmaceutics; General Immunology and Microbiology; General Biochemistry, Genetics and Molecular Biology; General Medicine

Link

https://f1000research.com/articles/8-296/v1/pdf

References: 25 articles.

1. A web server for comparative analysis of single-cell RNA-seq data.;A Alavi;Nat Commun.,2018

2. scPred: Cell type prediction at single-cell resolution.;J Alquicira-Hernandez;bioRxiv.,2018

3. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium.;M Ashburner;Nat Genet.,2000

4. Cell type discovery and representation in the era of high-content single cell phenotyping.;T Bakken;BMC Bioinformatics.,2017

5. An ontology for cell types.;J Bard;Genome Biol.,2005

Cited by 47 articles.

1. Adversarial learning enables unbiased organism-wide cross-species alignment of single-cell RNA data at scale;2024-08-11

2. Predicting cell types with supervised contrastive learning on cells and their types;Scientific Reports;2024-01-03

3. Challenges and opportunities to computationally deconvolve heterogeneous tissue with varying cell sizes using single-cell RNA-sequencing datasets;Genome Biology;2023-12-14

4. GoM DE: interpreting structure in sequence count data with differential expression analysis allowing for grades of membership;Genome Biology;2023-10-19

5. Alzheimer's disease‐induced phagocytic microglia express a specific profile of coding and non‐coding RNAs;Alzheimer's & Dementia;2023-10-12