Abstract
Bayesian inference produces a posterior distribution for the parameters and predictions of a mathematical model, and this posterior can be used to guide the formation of hypotheses; specifically, the posterior may be searched for evidence of alternative model hypotheses, which serves as a starting point for hypothesis formation and model refinement. Previous approaches to searching for this evidence are largely qualitative and unsystematic; further, demonstrations of these approaches typically stop at hypothesis formation, leaving the questions they raise unanswered. Here, we introduce a Kullback-Leibler (KL) divergence-based ranking to expedite Bayesian hypothesis formation and investigate the hypotheses it generates, ultimately yielding novel, biologically significant insights. Our approach uses KL divergence to rank parameters by how much information they gain from experimental data. Subsequently, rather than searching all model parameters at random, we use this ranking to prioritize the parameters that gained the most information from the data, examining their posteriors for evidence of alternative model hypotheses. We test our approach with two examples, which showcase its ability to systematically uncover different types of evidence for alternative hypotheses. First, we test our KL divergence ranking on an established example of Bayesian hypothesis formation. Our top-ranked parameter matches the one previously identified to produce alternative hypotheses. In the second example, we apply our ranking in a novel study of a computational model of prolactin-induced JAK2-STAT5 signaling, a pathway that mediates beta cell proliferation. Here, we cluster our KL divergence rankings to select only a subset of parameters to examine for qualitative evidence of alternative hypotheses, thereby expediting hypothesis formation. Within this subset, we find a bimodal posterior revealing two possible ranges for the prolactin receptor degradation rate.
We go on to refine the model, incorporating new data and determining which of the two degradation rate ranges is more plausible. Overall, we demonstrate that our approach offers a novel quantitative framework for Bayesian hypothesis formation and use it to produce a novel, biologically significant insight.
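The ranking step described above can be illustrated with a minimal sketch. It assumes that prior and posterior samples are available for each parameter and estimates KL(posterior || prior) from histograms on a shared grid; the estimator, the function names, the bin count, and the parameter names (`k_deg`, `k_bind`, `k_syn`) are all hypothetical choices made for illustration and are not taken from the study itself.

```python
import numpy as np


def kl_divergence_hist(posterior, prior, bins=50, pseudo=1e-6):
    """Histogram estimate of KL(posterior || prior), in nats, for 1-D samples.

    This estimator is an assumption of the sketch; a kernel density or
    nearest-neighbor estimator could be swapped in.
    """
    posterior = np.asarray(posterior, dtype=float)
    prior = np.asarray(prior, dtype=float)
    # Bin both sample sets on one shared grid so the bins are comparable.
    lo = min(posterior.min(), prior.min())
    hi = max(posterior.max(), prior.max())
    edges = np.linspace(lo, hi, bins + 1)
    p, _ = np.histogram(posterior, bins=edges)
    q, _ = np.histogram(prior, bins=edges)
    # A small pseudo-count keeps empty bins from producing log(0).
    p = (p + pseudo) / (p + pseudo).sum()
    q = (q + pseudo) / (q + pseudo).sum()
    return float(np.sum(p * np.log(p / q)))


def rank_by_information_gain(posteriors, priors, bins=50):
    """Return (name, KL) pairs sorted from most to least information gained."""
    scores = {
        name: kl_divergence_hist(posteriors[name], priors[name], bins=bins)
        for name in posteriors
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Synthetic demonstration with hypothetical parameter names and made-up samples.
rng = np.random.default_rng(0)
n_samples = 5000
priors = {name: rng.uniform(0.0, 1.0, n_samples) for name in ("k_deg", "k_bind", "k_syn")}
posteriors = {
    "k_deg": np.clip(rng.normal(0.5, 0.02, n_samples), 0.0, 1.0),   # tightly constrained
    "k_bind": np.clip(rng.normal(0.5, 0.15, n_samples), 0.0, 1.0),  # moderately constrained
    "k_syn": rng.uniform(0.0, 1.0, n_samples),                      # essentially unchanged
}

for name, score in rank_by_information_gain(posteriors, priors):
    print(f"{name}: KL = {score:.3f} nats")
```

Under these assumptions, parameters whose posteriors differ most from their priors appear first in the list, and those posteriors would be the first to inspect for features such as bimodality.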
Publisher
Cold Spring Harbor Laboratory