Abstract
AbstractA main limitation of bulk transcriptomic technologies is that individual measurements normally contain contributions from multiple cell populations, impeding the identification of cellular heterogeneity within diseased tissues. To extract cellular insights from existing large cohorts of bulk transcriptomic data, we present CSsingle, a novel method designed to accurately deconvolve bulk data into a predefined set of cell types using a scRNA-seq reference. Through comprehensive benchmark evaluations and analyses using diverse real data sets, we reveal the systematic bias inherent in existing methods, stemming from differences in cell size or library size. Our extensive experiments demonstrate that CSsingle exhibits superior accuracy and robustness compared to leading methods, particularly when dealing with bulk mixtures originating from cell types of markedly different cell sizes, as well as when handling bulk and single-cell reference data obtained from diverse sources. Our work provides an efficient and robust methodology for the integrated analysis of bulk and scRNA-seq data, facilitating various biological and clinical studies.
Publisher
Cold Spring Harbor Laboratory