Comparison and evaluation of statistical error models for scRNA-seq-Reference-Cited by-同舟云学术

Comparison and evaluation of statistical error models for scRNA-seq

Published:2022-01-18 Issue:1 Volume:23 Page:
ISSN:1474-760X
Container-title:Genome Biology
language:en
Short-container-title:Genome Biol

Author:

Choudhary Saket,Satija Rahul^ORCID

Abstract

Abstract Background Heterogeneity in single-cell RNA-seq (scRNA-seq) data is driven by multiple sources, including biological variation in cellular state as well as technical variation introduced during experimental processing. Deconvolving these effects is a key challenge for preprocessing workflows. Recent work has demonstrated the importance and utility of count models for scRNA-seq analysis, but there is a lack of consensus on which statistical distributions and parameter settings are appropriate. Results Here, we analyze 59 scRNA-seq datasets that span a wide range of technologies, systems, and sequencing depths in order to evaluate the performance of different error models. We find that while a Poisson error model appears appropriate for sparse datasets, we observe clear evidence of overdispersion for genes with sufficient sequencing depth in all biological systems, necessitating the use of a negative binomial model. Moreover, we find that the degree of overdispersion varies widely across datasets, systems, and gene abundances, and argues for a data-driven approach for parameter estimation. Conclusions Based on these analyses, we provide a set of recommendations for modeling variation in scRNA-seq data, particularly when using generalized linear models or likelihood-based approaches for preprocessing and downstream analysis.

Funder

chan zuckerberg initiative

national human genome research institute

chan zuckerburg initiative

nih office of the director

Publisher

Springer Science and Business Media LLC

Link

https://link.springer.com/content/pdf/10.1186/s13059-021-02584-9.pdf

Reference79 articles.

1. Hodge RD, Miller JA, Novotny M, Kalmbach BE, Ting JT, Bakken TE, Aevermann BD, Barkan ER, Berkowitz-Cerasano ML, Cobbs C, Diez-Fuertes F, Ding S-L, McCorrison J, Schork NJ, Shehata SI, Smith KA, Sunkin SM, Tran DN, Venepally P, Yanny AM, Steemers FJ, Phillips JW, Bernard A, Koch C, Lasken RS, Scheuermann RH, Lein ES. Transcriptomic evidence that von economo neurons are regionally specialized extratelencephalic-projecting excitatory neurons. Nat Commun. 2020; 11(1):1172.

2. Colquitt BM, Merullo DP, Konopka G, Roberts TF, Brainard MS. Cellular transcriptomics reveals evolutionary identities of songbird vocal circuits. Science. 2021; 371(6530):9704.