Benchmarking and new generative methods for single-cell transcriptome data in bulk RNA sequence deconvolution

Published: 2023-09-18

Author:

Nishikawa Toui¹, Lee Masatoshi¹, Amau Masataka²

Affiliation:

1. Wakayama Medical University

2. Kyoto University

Abstract

Numerous methods for bulk RNA sequence deconvolution have been developed to identify cellular targets of disease by estimating the cell-type composition of disease-related tissues. However, two obstacles to accurate bulk deconvolution remain: heterogeneity in gene expression between subjects and a shortage of reference single-cell RNA sequence (scRNAseq) data. In this study, we investigated whether a new generative method, sc-CMGAN, and three benchmark generative methods (Copula, CTGAN and TVAE) could address these issues and improve the accuracy of bulk deconvolution. We also evaluated the robustness of sc-CMGAN using three deconvolution methods and four public datasets. In almost all conditions, the generative methods improved deconvolution accuracy. Notably, sc-CMGAN outperformed the benchmark methods and was more robust. This study is the first to examine the impact of data augmentation on bulk deconvolution. The new generative method, sc-CMGAN, is expected to become the gold standard preprocessing step for bulk deconvolution.

Publisher

Research Square Platform LLC

References (23 articles)

1. The immune contexture in human tumours: impact on clinical outcome; Fridman WH; Nature Reviews Cancer, 2012

2. Cellular composition of the human diabetic pancreas; Rahier J; Diabetologia, 1983

3. Computational and analytical challenges in single-cell transcriptomics; Stegle O; Nat Rev Genet, 2015

4. Comparative Analysis of Single-Cell RNA Sequencing Methods; Ziegenhain C; Mol Cell, 2017

5. Separation of samples into their constituents using gene expression data; Venet D; Bioinformatics, 2001