Abstract
Motivation: The Gamma-Poisson distribution is a theoretically and empirically motivated model for the sampling variability of single cell RNA-sequencing counts (Grün et al., 2014; Townes et al., 2019; Svensson, 2020; Silverman et al., 2018; Hafemeister and Satija, 2019) and an essential building block for analysis approaches including differential expression analysis (Robinson et al., 2010; McCarthy et al., 2012; Anders and Huber, 2010; Love et al., 2014), principal component analysis (Townes et al., 2019) and factor analysis (Risso et al., 2018). Existing implementations for inferring its parameters from data often struggle with the size of single cell datasets, which typically comprise thousands or millions of cells; at the same time, they do not take full advantage of the fact that zero and other small numbers are frequent in the data. These limitations have hampered uptake of the model, leaving room for statistically inferior approaches such as logarithm(-like) transformation.

Results: We present a new R package for fitting the Gamma-Poisson distribution to data with the characteristics of modern single cell datasets more quickly and more accurately than existing methods. The software can work with data on disk without having to load them into RAM simultaneously.

Availability: The package glmGamPoi is available from Bioconductor (since release 3.11) for Windows, macOS, and Linux, and source code is available on GitHub under a GPL-3 license. The scripts to reproduce the results of this paper are available on GitHub as well.

Contact: constantin.ahlmann@embl.de
Publisher
Cold Spring Harbor Laboratory
References: 19 articles.
1. 10X Genomics (2017a). Data from the 10X 1.3 Million Brain Cell Study.
2. 10X Genomics (2017b). Data from the 10X Genomics on Peripheral Blood Mononuclear Cells.
3. Anders, S. and Huber, W. (2010). Differential expression analysis for sequence count data. Genome Biology.
4. Crowell, H. L., Soneson, C., Germain, P.-L., Calini, D., Collin, L., Raposo, C., Malhotra, D., and Robinson, M. D. (2019). On the discovery of population-specific state transitions from multi-sample multi-condition single-cell RNA sequencing data. bioRxiv, pages 1–24.
5. Grün, D. et al. (2014). Validation of noise models for single-cell transcriptomics. Nature Methods.
Cited by
3 articles.