scBoolSeq: Linking scRNA-Seq Statistics and Boolean Dynamics

Author:

Magaña López Gustavo, Calzone Laurence, Zinovyev Andrei, Paulevé Loïc

Abstract

Boolean networks are widely used to model the qualitative dynamics of cell fate processes by describing how the binary activation states of genes and transcription factors change over time. Being able to bridge such qualitative states with quantitative measurements of gene expression in cells, such as scRNA-Seq, is a cornerstone for data-driven model construction and validation. On the one hand, scRNA-Seq binarisation is a key step for inferring and validating Boolean models. On the other hand, the generation of synthetic scRNA-Seq data from baseline Boolean models provides an important asset for benchmarking inference methods. However, linking characteristics of scRNA-Seq datasets, including dropout events, with Boolean states is a challenging task.

We present scBoolSeq, a method for the bidirectional linking of scRNA-Seq data and the Boolean activation states of genes. Given a reference scRNA-Seq dataset, scBoolSeq computes statistical criteria to classify the empirical gene pseudocount distributions as either unimodal, bimodal, or zero-inflated, and fits a probabilistic model of dropouts with gene-dependent parameters. From these learnt distributions, scBoolSeq can both binarise scRNA-Seq datasets and generate synthetic scRNA-Seq datasets from Boolean trajectories, such as those produced by Boolean networks, using biased sampling and dropout simulation. We present a case study demonstrating the application of scBoolSeq's binarisation scheme in data-driven model inference. Furthermore, we compare synthetic scRNA-Seq data generated by scBoolSeq with data generated by BoolODE from the same Boolean network model. The comparison shows that our method better reproduces the statistics of real scRNA-Seq datasets, such as the mean-variance and mean-dropout relationships, while exhibiting clearly defined trajectories in a two-dimensional projection of the data.

Author summary

The qualitative and logical modeling of cell dynamics has brought valuable insight into the gene regulatory mechanisms that drive cellular differentiation and fate decisions, by predicting cellular trajectories and the mutations that can control them. However, the design and validation of these models is impeded by the quantitative nature of experimental measurements of cellular states. In this paper, we provide and assess a new methodology, scBoolSeq, for bridging single-cell pseudocounts of RNA transcripts with a Boolean classification of gene activity levels. Our method, implemented as a Python package, makes it possible both to binarise scRNA-Seq data in order to match quantitative measurements with the states of logical models, and to generate synthetic data from Boolean trajectories in order to benchmark inference methods. We show that scBoolSeq accurately captures the main statistical features of scRNA-Seq data, including measurement dropouts, significantly improving on the state of the art. Overall, scBoolSeq provides a statistically grounded method for enabling the inference and validation of qualitative models from scRNA-Seq data.
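
To make the two directions of the linking concrete, the following is a minimal NumPy sketch of the general idea: classify a gene's distribution, binarise it, and sample synthetic values with dropout from Boolean states. It is not the scBoolSeq implementation or its API. Every function name (`two_means_1d`, `classify_gene`, `binarise_gene`, `sample_gene`), every cutoff, the 2-means separation criterion, the quartile rule for non-bimodal genes, and the exponential dropout model are assumptions chosen for illustration only.

```python
import numpy as np


def two_means_1d(x, n_iter=50):
    """Plain 1-D 2-means: return (low mean, high mean, midpoint threshold)."""
    lo, hi = float(x.min()), float(x.max())
    for _ in range(n_iter):
        threshold = (lo + hi) / 2.0
        low, high = x[x <= threshold], x[x > threshold]
        if low.size == 0 or high.size == 0:
            break
        lo, hi = float(low.mean()), float(high.mean())
    return lo, hi, (lo + hi) / 2.0


def classify_gene(x, zero_frac_cutoff=0.5, separation_cutoff=4.0):
    """Label a gene as zero-inflated, bimodal or unimodal (hypothetical cutoffs)."""
    x = np.asarray(x, dtype=float)
    if np.mean(x == 0) > zero_frac_cutoff:
        return "zero-inflated"
    lo, hi, threshold = two_means_1d(x)
    low, high = x[x <= threshold], x[x > threshold]
    # A very small second cluster is treated as a tail, not a second mode.
    if min(low.size, high.size) < 0.1 * x.size:
        return "unimodal"
    pooled_sd = np.sqrt((low.var() * low.size + high.var() * high.size) / x.size)
    separation = np.inf if pooled_sd == 0 else (hi - lo) / pooled_sd
    return "bimodal" if separation > separation_cutoff else "unimodal"


def binarise_gene(x, category):
    """Map expression to 1 (active), 0 (inactive) or NaN (undetermined)."""
    x = np.asarray(x, dtype=float)
    if category == "bimodal":
        _, _, threshold = two_means_1d(x)
        return (x > threshold).astype(float)
    # Assumed rule for other categories: only the outer quartiles are decided.
    q1, q3 = np.percentile(x, [25, 75])
    out = np.full(x.shape, np.nan)
    out[x < q1] = 0.0
    out[x > q3] = 1.0
    return out


def sample_gene(states, lo, hi, sd, dropout_rate=0.5, rng=None):
    """Biased sampling from Boolean states, followed by simulated dropout."""
    rng = np.random.default_rng() if rng is None else rng
    states = np.asarray(states, dtype=bool)
    means = np.where(states, hi, lo)  # active cells sample around the high mode
    values = np.clip(rng.normal(means, sd), 0.0, None)
    # Assumed dropout model: lower expression is more likely to be zeroed out.
    p_drop = np.exp(-dropout_rate * values)
    values[rng.random(values.shape) < p_drop] = 0.0
    return values


# Usage on a toy reference gene with two well-separated modes.
rng = np.random.default_rng(0)
reference = np.concatenate([rng.normal(2.0, 0.5, 300), rng.normal(8.0, 0.5, 300)])
category = classify_gene(reference)
binary = binarise_gene(reference, category)
lo, hi, _ = two_means_1d(reference)
synthetic = sample_gene(binary.astype(bool), lo, hi, sd=0.5, rng=rng)
```

In this toy setting the same per-gene parameters (the two mode means) serve both directions: they define the binarisation threshold and the distributions from which synthetic values are drawn. The actual statistical criteria and dropout model used by scBoolSeq are described in the paper and may differ from these simplifications.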

Publisher

Cold Spring Harbor Laboratory
