Abstract
What features of transcription can be learnt by fitting mathematical models of gene expression to mRNA count data? Given a suite of models, fitting to data selects an optimal one, thus identifying a probable transcriptional mechanism. Whilst this approach is attractive, its utility remains unclear. Here, we sample steady-state, single-cell mRNA count distributions from parameters in the physiological range, and show that they cannot be used to confidently estimate the number of inactive gene states, i.e. the number of rate-limiting steps in transcriptional initiation. Distributions from over 99% of the parameter space, generated using models with 2, 3, or 4 inactive states, can be well fit by a model with a single inactive state. However, we show that if the mRNA lifetime is hours long, then for many minutes following induction the increase in the mean mRNA count obeys a power law whose exponent equals the sum of the number of states visited from the initial inactive state to the active state and the number of rate-limiting post-transcriptional processing steps. Our study shows that estimating this exponent from eukaryotic data by non-linear regression is sufficient to estimate the total number of regulatory steps in transcription initiation, splicing, and nuclear export.
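The power-law behaviour described above can be illustrated with a minimal sketch. It is not the authors' code, and everything in it is an assumption made for illustration: the toy model (a gene stepping through N inactive states U1 -> ... -> UN -> A at rate `k_on`, returning from A to U1 at rate `k_off`, with mRNA produced from A at rate `rho` and degraded slowly at rate `d`), the function names `mean_mrna` and `fit_exponent`, the parameter values, and the arbitrary time units. No post-transcriptional steps are modelled, and a log-log straight-line fit stands in for the non-linear regression mentioned in the text. Under these assumptions, the fitted short-time exponent should come out close to N + 1, i.e. the number of states visited from U1 to A inclusive.

```python
import math


def mean_mrna(n_inactive, k_on=1.0, k_off=0.5, rho=10.0, d=0.01,
              t_end=0.05, steps=2000):
    """Return (times, means): mean mRNA count after induction at t = 0.

    Hypothetical toy model (all rates in arbitrary units):
    U1 -> U2 -> ... -> UN -> A at rate k_on per step, A -> U1 at rate k_off,
    mRNA made from A at rate rho and degraded at rate d.
    The gene starts in U1 with no mRNA.
    """
    n = n_inactive

    def deriv(y):
        # y[0..n-1]: inactive-state probabilities, y[n]: active, y[n+1]: mean mRNA
        dy = [0.0] * (n + 2)
        for i in range(n):
            dy[i] -= k_on * y[i]       # leave state i
            dy[i + 1] += k_on * y[i]   # enter the next state (index n is active)
        dy[n] -= k_off * y[n]          # active gene switches off...
        dy[0] += k_off * y[n]          # ...and returns to the first inactive state
        dy[n + 1] = rho * y[n] - d * y[n + 1]
        return dy

    y = [1.0] + [0.0] * (n + 1)
    h = t_end / steps
    times, means = [], []
    for step in range(1, steps + 1):
        # classical fourth-order Runge-Kutta step
        k1 = deriv(y)
        k2 = deriv([a + 0.5 * h * b for a, b in zip(y, k1)])
        k3 = deriv([a + 0.5 * h * b for a, b in zip(y, k2)])
        k4 = deriv([a + h * b for a, b in zip(y, k3)])
        y = [a + h * (p + 2 * q + 2 * r + s) / 6.0
             for a, p, q, r, s in zip(y, k1, k2, k3, k4)]
        times.append(step * h)
        means.append(y[n + 1])
    return times, means


def fit_exponent(times, means, t_min):
    """Least-squares slope of log(mean) against log(t) for t >= t_min."""
    pts = [(math.log(t), math.log(m))
           for t, m in zip(times, means) if t >= t_min and m > 0.0]
    x_bar = sum(x for x, _ in pts) / len(pts)
    y_bar = sum(v for _, v in pts) / len(pts)
    num = sum((x - x_bar) * (v - y_bar) for x, v in pts)
    den = sum((x - x_bar) ** 2 for x, _ in pts)
    return num / den


if __name__ == "__main__":
    for n in (1, 2, 3):
        t, m = mean_mrna(n)
        print(n, round(fit_exponent(t, m, t_min=0.005), 3))
```

The fit is restricted to times much shorter than `1 / k_on` and `1 / d`, since the power law is only a short-time approximation in this toy model; at later times the curve bends towards its steady state and the fitted slope drifts below N + 1.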
Publisher
Cold Spring Harbor Laboratory