Exponential family measurement error models for single-cell CRISPR screens

Author:

Barry Timothy (1), Roeder Kathryn (2), Katsevich Eugene (3)

Affiliation:

1. Department of Biostatistics, Harvard T.H. Chan School of Public Health, Building 2 435, 655 Huntington Ave, Boston, MA 02115, United States

2. Department of Statistics and Data Science, Carnegie Mellon University, Baker Hall 228B, 4909 Frew St, Pittsburgh, PA 15213, United States

3. Department of Statistics and Data Science, University of Pennsylvania, Academic Research Building 311, 265 South 37th Street, Philadelphia, PA 19104, United States

Abstract

CRISPR genome engineering and single-cell RNA sequencing have accelerated biological discovery. Single-cell CRISPR screens unite these two technologies, linking genetic perturbations in individual cells to changes in gene expression and illuminating regulatory networks underlying diseases. Despite their promise, single-cell CRISPR screens present considerable statistical challenges. We demonstrate through theoretical and real data analyses that a standard method for estimation and inference in single-cell CRISPR screens, “thresholded regression,” exhibits attenuation bias and a bias-variance tradeoff as a function of an intrinsic, challenging-to-select tuning parameter. To overcome these difficulties, we introduce GLM-EIV (“GLM-based errors-in-variables”), a new method for single-cell CRISPR screen analysis. GLM-EIV extends the classical errors-in-variables model to responses and noisy predictors that are exponential family-distributed and potentially impacted by the same set of confounding variables. We develop a computational infrastructure to deploy GLM-EIV across hundreds of processors on clouds (e.g., Microsoft Azure) and high-performance clusters. Leveraging this infrastructure, we apply GLM-EIV to analyze two recent, large-scale, single-cell CRISPR screen datasets, yielding several new insights.
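The attenuation bias attributed to thresholded regression can be illustrated with a small simulation. The sketch below is not GLM-EIV and not the authors' code. It is a toy model under stated assumptions: a latent binary perturbation indicator, a Poisson gRNA count whose mean depends on that indicator, and a Gaussian response (chosen for simplicity in place of a count model). All function names and parameter values are hypothetical.

```python
import math
import random


def sample_poisson(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate(n, seed, pi=0.2, lam_background=1.0, lam_perturbed=3.0,
             beta0=2.0, beta1=-1.0, sigma=1.0):
    """Simulate latent perturbation status, noisy gRNA counts, and expression.

    All defaults are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    perturbed, grna, expr = [], [], []
    for _ in range(n):
        p = 1 if rng.random() < pi else 0            # latent (unobserved) status
        g = sample_poisson(rng, lam_perturbed if p else lam_background)
        m = beta0 + beta1 * p + rng.gauss(0.0, sigma)  # Gaussian toy response
        perturbed.append(p)
        grna.append(g)
        expr.append(m)
    return perturbed, grna, expr


def diff_in_means(y, x):
    """OLS slope of y on a binary regressor x, i.e. the difference in group means."""
    y1 = [yi for yi, xi in zip(y, x) if xi == 1]
    y0 = [yi for yi, xi in zip(y, x) if xi == 0]
    return sum(y1) / len(y1) - sum(y0) / len(y0)


TRUE_EFFECT = -1.0
perturbed, grna, expr = simulate(n=20000, seed=0, beta1=TRUE_EFFECT)

# Oracle: regress on the true (normally unobservable) perturbation status.
oracle = diff_in_means(expr, perturbed)

# Thresholded regression: impute status as 1 when the gRNA count reaches c.
thresholded = {
    c: diff_in_means(expr, [1 if g >= c else 0 for g in grna])
    for c in (1, 2, 5)
}

print(f"true effect: {TRUE_EFFECT:.3f}, oracle estimate: {oracle:.3f}")
for c, est in thresholded.items():
    n_labeled = sum(1 for g in grna if g >= c)
    print(f"threshold {c}: estimate {est:.3f} ({n_labeled} cells labeled perturbed)")
```

In this toy setup the thresholded estimates shrink toward zero because some cells are misclassified, and the printed cell counts show the tradeoff: a higher threshold misclassifies fewer background cells but labels fewer cells as perturbed, which leaves less data for estimating the perturbed-group mean.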

Funder

National Institute of Mental Health

Publisher

Oxford University Press (OUP)

Cited by 3 articles.