Author:
Lu Jonathan, Dumitrascu Bianca, McDowell Ian C., Jo Brian, Barrera Alejandro, Hong Linda K., Leichter Sarah M., Reddy Timothy E., Engelhardt Barbara E.
Abstract
Gene regulatory network inference is essential to uncover complex relationships among gene pathways and inform downstream experiments, ultimately paving the way for regulatory network re-engineering. Network inference from transcriptional time series data requires accurate, interpretable, and efficient determination of causal relationships among thousands of genes. Here, we develop Bootstrap Elastic net regression from Time Series (BETS), a statistical framework based on Granger causality for the recovery of a directed gene network from transcriptional time series data. BETS uses elastic net regression and stability selection from bootstrapped samples to infer causal relationships among genes. BETS is highly parallelized, enabling efficient analysis of large transcriptional data sets. We show competitive accuracy on a community benchmark, the DREAM4 100-gene network inference challenge, where BETS is among the fastest methods of similar performance and additionally infers whether the causal effects are activating or inhibitory. We apply BETS to transcriptional time series data of 2,768 differentially expressed genes from A549 cells exposed to glucocorticoids over a period of 12 hours. We identify a network of 2,768 genes and 31,945 directed edges (FDR ≤ 0.2). We validate inferred causal network edges using two external data sources: overexpression experiments on the same glucocorticoid system, and genetic variants associated with inferred edges in primary lung tissue in the Genotype-Tissue Expression (GTEx) v6 project. BETS is freely available as an open source software package at https://github.com/lujonathanh/BETS.
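The core idea described in the abstract (lagged elastic net regression, refit on bootstrapped samples, keeping only edges that are selected consistently) can be illustrated with a minimal sketch. This is not the BETS implementation: the lag of 1, the penalty values, the 0.8 selection-frequency cutoff, and all function names below are assumptions chosen for illustration, and the sketch omits the parallelization and FDR control mentioned in the abstract.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net(X, y, alpha=0.1, l1_ratio=0.5, n_iter=100):
    """Coordinate descent for
    (1/2n)||y - Xb||^2 + alpha*l1_ratio*||b||_1 + 0.5*alpha*(1-l1_ratio)*||b||^2.
    No intercept is fitted, so inputs are assumed to be roughly centered."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            z = sum(v * v for v in col) / n
            if z == 0.0:
                continue
            # Correlation of column j with the partial residual.
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, residual)) / n
            new = soft_threshold(rho, alpha * l1_ratio) / (z + alpha * (1 - l1_ratio))
            delta = new - beta[j]
            if delta != 0.0:
                residual = [r - c * delta for r, c in zip(residual, col)]
                beta[j] = new
    return beta


def lagged_design(series, target, lag=1):
    """series[t][g] is the expression of gene g at time t.
    Predict the target gene at time t from all genes at time t - lag."""
    X = [series[t - lag] for t in range(lag, len(series))]
    y = [series[t][target] for t in range(lag, len(series))]
    return X, y


def bootstrap_edges(series, target, n_boot=20, freq_cutoff=0.8, seed=0):
    """Return {source_gene: 'activating' | 'inhibitory'} for sources whose
    coefficient is nonzero in at least freq_cutoff of the bootstrap fits."""
    rng = random.Random(seed)
    X, y = lagged_design(series, target)
    n, p = len(X), len(X[0])
    counts = [0] * p
    sign_sum = [0] * p
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample rows with replacement
        beta = elastic_net([X[i] for i in idx], [y[i] for i in idx])
        for j, b in enumerate(beta):
            if b != 0.0:
                counts[j] += 1
                sign_sum[j] += 1 if b > 0 else -1
    edges = {}
    for j in range(p):
        # The target's own lag stays in the model but is not reported as an edge.
        if j != target and counts[j] / n_boot >= freq_cutoff:
            edges[j] = "activating" if sign_sum[j] > 0 else "inhibitory"
    return edges
```

Calling `bootstrap_edges(series, target=g)` once per gene `g` yields a directed, signed edge list. Because each target gene is fitted independently, the per-gene calls are a natural unit for parallel execution.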
Publisher
Cold Spring Harbor Laboratory
Cited by
3 articles.