Abstract
The growth of bacterial gene expression datasets has offered unprecedented coverage of achievable transcriptomes, reflecting diverse activity states of the transcription regulatory network. Machine learning methods like Independent Component Analysis (ICA) can decompose gene expression datasets into regulatory modules and condition-specific regulator activities. Here, we present a workflow to utilize inferred regulator activities to construct quantitative models of promoter regulation in E. coli. Resulting models are validated by predicting condition-specific TF effector concentrations and binding site motif strength based on differential gene expression data alone. We show how reconstructed promoter models can capture multi-scale regulation and disentangle regulator interactions, including resolving the apparent paradox where argR expression is positively correlated with its regulon despite being a repressor. We applied the workflow for all regulator-linked components extracted by ICA, demonstrating the scalability of the workflow to capture the E. coli TRN. This work suggests a path toward systematic, quantitative reconstruction of transcription regulatory networks driven by the large-scale databases that are now available for many organisms.
Publisher
Cold Spring Harbor Laboratory