Abstract
Joint analysis of multiple single cell RNA-sequencing (scRNA-seq) data sets is confounded by technical batch effects across experiments, biological or environmental variability across cells, and different capture processes across sequencing platforms. Manifold alignment is a principled, effective tool for integrating multiple data sets and controlling for confounding factors. We demonstrate that the semi-supervised t-distributed Gaussian process latent variable model (sstGPLVM), which projects the data onto a mixture of fixed and latent dimensions, can learn a unified low-dimensional embedding for multiple single cell experiments with minimal assumptions. We show the efficacy of the model compared with state-of-the-art methods for single cell data integration on simulated data, pancreas cells from four sequencing technologies, induced pluripotent stem cells from male and female donors, and mouse brain cells from both spatial seqFISH+ and traditional scRNA-seq. Code and data are available at https://github.com/architverma1/sc-manifold-alignment
Publisher
Cold Spring Harbor Laboratory
Cited by
7 articles.