Author:
DeMeo Benjamin, Berger Bonnie
Abstract
Dimensionality reduction is crucial to summarizing the complex transcriptomic landscape of single cell datasets for downstream analyses. However, current dimensionality reduction approaches favor large cellular populations defined by many genes, at the expense of smaller and more subtly defined populations. Here, we present surprisal component analysis (SCA), a technique that leverages the information-theoretic notion of surprisal for dimensionality reduction, and demonstrate its ability to improve the representation of clinically important populations that are indistinguishable using existing pipelines. For example, in cytotoxic T-cell data, SCA cleanly separates the gamma-delta and MAIT cell subpopulations, which are not detectable via PCA, ICA, scVI, or a wide array of specialized rare cell recovery tools. We also show that, when used instead of PCA, SCA improves downstream imputation to more accurately restore mRNA dropouts and recover important gene-gene relationships. SCA’s information-theoretic paradigm opens the door to more meaningful signal extraction, with broad applications to the study of complex biological tissues in health and disease.
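For orientation, surprisal in the standard information-theoretic sense is the negative logarithm of an event's probability, so rarer events carry more information. The sketch below illustrates only that textbook definition. The abstract does not say how SCA computes or uses surprisal, so this is not the authors' method, and the function name `surprisal_bits` is a hypothetical helper chosen for illustration.

```python
import math


def surprisal_bits(p: float) -> float:
    """Surprisal (self-information) in bits of an event with probability p.

    Hypothetical helper illustrating the textbook definition -log2(p);
    it is not taken from the SCA paper or its software.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return -math.log2(p)


# A common event carries little information; a rare one carries more.
common = surprisal_bits(0.5)
rare = surprisal_bits(0.01)
print(f"common event: {common:.2f} bits, rare event: {rare:.2f} bits")
```

Under this definition, an observation that is unlikely under a background model scores higher than a likely one, which is the general sense in which a surprisal-based approach can give weight to small or subtly defined signals.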
Publisher
Cold Spring Harbor Laboratory