Abstract
In this work, we introduce a new methodology for inferring the interaction structure of discrete-valued time series that are Poisson distributed. While most related methods are premised on continuous-state stochastic processes, discrete, counting-event-oriented stochastic processes, so-called time-point processes, are in fact natural and common. An important application that we focus on here is gene expression, where it is often assumed that the data are generated from a multivariate Poisson distribution. Nonparametric methods such as the popular k-nearest neighbors method converge slowly for discrete processes and are thus data hungry. With the new multivariate Poisson estimator developed here as the core computational engine, the causation entropy (CSE) principle, together with the associated greedy search algorithm, optimal CSE (oCSE), allows us to efficiently infer the true network structure for this class of stochastic processes, for which such inference was previously not practical. We illustrate the power of our method, first by benchmarking with synthetic data, and then by inferring the genetic factors network from a breast cancer micro-ribonucleic acid sequence count data set. We show that the Poisson oCSE gives the best performance among the tested methods and discovers previously known interactions on the breast cancer data set.
Funder
Defense Advanced Research Projects Agency
Publisher
Springer Science and Business Media LLC
Subject
Computational Mathematics, Computer Networks and Communications, Multidisciplinary