Author:
Guo Xin-Yang, Huang Zhao-Yang, Ju Fen, Zhao Chen-Guang, Yu Liang
Abstract
Accurately identifying the cellular composition of complex tissues is critical for understanding disease pathogenesis, early diagnosis, and prevention. However, current methods for deconvoluting bulk RNA sequencing (RNA-seq) data typically rely on matched single-cell RNA sequencing (scRNA-seq) data as a reference, which can be limiting because of differences in sequencing distributions and the potential for invalid information in single-cell references. To overcome these limitations, we introduce SCROAM, a novel computational method. SCROAM transforms scRNA-seq and bulk RNA-seq data into a shared feature space, effectively eliminating distributional differences in the latent space. It then generates cell-type-specific expression matrices from the scRNA-seq data, enabling accurate identification of cell types in bulk tissues. We evaluated SCROAM on simulated datasets and on human breast cancer and peripheral blood datasets, demonstrating its accuracy and robustness. To further validate its performance, we conducted single-cell and bulk RNA-seq experiments on mouse spinal cord tissue and applied SCROAM to identify the cell types in the bulk tissue. Our results indicate that SCROAM is highly effective at identifying similar cell types, outperforming existing methods. We then performed an integrated analysis of liver cancer and primary glioblastoma to investigate the relationship between cell type composition and clinical outcomes in different tumor types, highlighting the value of SCROAM for understanding cellular heterogeneity in complex diseases. Overall, our work presents a novel approach to accurately inferring cellular composition and expression from bulk RNA-seq data, offering valuable insights into disease pathogenesis and potential therapeutic strategies.
Publisher
Cold Spring Harbor Laboratory