Abstract
Motivation: Accurate deconvolution of cell types from bulk gene expression is crucial for understanding cellular compositions and uncovering cell-type specific differential expression and physiological states of diseased tissues. Existing deconvolution methods have limitations, such as requiring complete cellular gene expression signatures or neglecting partial biological information. Moreover, these methods often overlook varying cell-type mRNA amounts, leading to biased proportion estimates. Additionally, they do not effectively utilize valuable reference information from external studies, such as means and ranges of population cell-type proportions.
Results: To address these challenges, we introduce an Adaptive Regularized Tri-factor non-negative matrix factorization approach for deconvolution (ARTdeConv). We rigorously establish the numerical convergence of our algorithm. Through benchmark simulations, we demonstrate the superior performance of ARTdeConv compared to state-of-the-art reference-free methods. In a real-world application, our method accurately estimates cell proportions, as evidenced by the nearly perfect Pearson's correlation between ARTdeConv estimates and flow cytometry measurements in a dataset from a trivalent influenza vaccine study. Moreover, our analysis of ARTdeConv estimates in COVID-19 patients reveals patterns consistent with important immunological phenomena observed in other studies.
Availability and implementation: The proposed method, ARTdeConv, is implemented as an R package and can be accessed on GitHub for researchers and practitioners at https://github.com/gr8lawrence/ARTDeConv.
Keywords: Cell-type deconvolution, Convergence analysis, Multiplicative update algorithm, Non-negative matrix factorization, RNA sequencing, Single cell data
Publisher
Cold Spring Harbor Laboratory