scShapes: a statistical framework for identifying distribution shapes in single-cell RNA-sequencing data

Author:

Dharmaratne, Malindrie (1); Kulkarni, Ameya S (2, 3); Taherian Fard, Atefeh (1); Mar, Jessica C (1)

Affiliation:

1. Australian Institute for Bioengineering and Nanotechnology, The University of Queensland, Brisbane, QLD 4072, Australia

2. Institute for Aging Research, Albert Einstein College of Medicine, Bronx, New York, NY 10461, USA

3. Department of Medicine, Division of Endocrinology, Albert Einstein College of Medicine, Bronx, New York, NY 10461, USA

Abstract

Background: Single-cell RNA sequencing (scRNA-seq) methods have been advantageous for quantifying cell-to-cell variation by profiling the transcriptomes of individual cells. For scRNA-seq data, variability in gene expression reflects the degree of variation in gene expression from one cell to another. Analyses that focus on cell–cell variability therefore are useful for going beyond changes based on average expression and, instead, identifying genes with homogeneous expression versus those that vary widely from cell to cell.

Results: We present a novel statistical framework, scShapes, for identifying differential distributions in single-cell RNA-sequencing data using generalized linear models. Most approaches for differential gene expression detect shifts in the mean value. However, as single-cell data are driven by overdispersion and dropouts, moving beyond means and using distributions that can handle excess zeros is critical. scShapes quantifies gene-specific cell-to-cell variability by testing for differences in the expression distribution while flexibly adjusting for covariates if required. We demonstrate that scShapes identifies subtle variations that are independent of altered mean expression and detects biologically relevant genes that were not discovered through standard approaches.

Conclusions: This analysis also draws attention to genes that switch distribution shapes from a unimodal distribution to a zero-inflated distribution and raises open questions about the plausible biological mechanisms that may give rise to this, such as transcriptional bursting. Overall, the results from scShapes help to expand our understanding of the role that gene expression plays in the transcriptional regulation of a specific perturbation or cellular phenotype. Our framework scShapes is incorporated into a Bioconductor R package (https://www.bioconductor.org/packages/release/bioc/html/scShapes.html).
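To make the idea of a distribution "shape" concrete, the sketch below fits two candidate count distributions to one gene's counts and keeps the better one. It is a simplified illustration, not the scShapes API: it assumes only two candidates (Poisson and zero-inflated Poisson), an intercept-only model with no covariates, and BIC as the selection criterion. All function names and both count vectors are hypothetical and invented for the example.

```python
import math

def poisson_loglik(counts, lam):
    """Log-likelihood of counts under Poisson(lam)."""
    return sum(k * math.log(lam) - lam - math.lgamma(k + 1) for k in counts)

def zip_loglik(counts, pi, lam):
    """Log-likelihood under a zero-inflated Poisson: extra zeros with prob. pi."""
    total = 0.0
    for k in counts:
        if k == 0:
            total += math.log(pi + (1 - pi) * math.exp(-lam))
        else:
            total += math.log(1 - pi) + k * math.log(lam) - lam - math.lgamma(k + 1)
    return total

def fit_poisson(counts):
    """Closed-form maximum likelihood: lam is the sample mean."""
    lam = sum(counts) / len(counts)
    return poisson_loglik(counts, lam)

def fit_zip(counts, iterations=200):
    """Fit the zero-inflated Poisson with a basic EM loop."""
    n = len(counts)
    n_zero = sum(1 for k in counts if k == 0)
    total = sum(counts)
    lam = total / (n - n_zero)      # start from the mean of the nonzero counts
    pi = 0.5 * n_zero / n           # start with half the zeros called "extra"
    for _ in range(iterations):
        # E-step: probability that an observed zero is an extra (structural) zero
        z = pi / (pi + (1 - pi) * math.exp(-lam))
        # M-step: update the mixing weight and the Poisson mean
        pi = n_zero * z / n
        lam = total / (n - n_zero * z)
    return zip_loglik(counts, pi, lam)

def bic(loglik, n_params, n_obs):
    """Bayesian information criterion; lower is better."""
    return n_params * math.log(n_obs) - 2 * loglik

def select_shape(counts):
    """Return the name of the candidate distribution with the lowest BIC."""
    if sum(counts) == 0:
        raise ValueError("gene has no nonzero counts; nothing to fit")
    n = len(counts)
    scores = {
        "Poisson": bic(fit_poisson(counts), 1, n),
        "ZIP": bic(fit_zip(counts), 2, n),
    }
    return min(scores, key=scores.get)

# Hypothetical counts for one gene across 100 cells in two conditions.
# Condition A: zero frequency roughly what a Poisson with mean 2 predicts.
poisson_like = [0] * 14 + [1] * 27 + [2] * 27 + [3] * 18 + [4] * 9 + [5] * 4 + [6]
# Condition B: many more zeros than the nonzero counts would suggest.
zero_inflated = [0] * 60 + [3, 4, 5, 2, 3, 4, 6, 3, 2, 5] * 4

print("condition A:", select_shape(poisson_like))
print("condition B:", select_shape(zero_inflated))
```

A gene whose selected distribution differs between the two conditions is the kind of "shape switch" the abstract refers to. The two conditions here also differ in mean; the sketch only shows how a shape label is assigned, not a full test for differential distribution.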

Funder

Australian Research Council

Australasian Genomic Technologies Association

Publisher

Oxford University Press (OUP)

Subject

Computer Science Applications, Health Informatics

