Author:
Durdu Sevi, Iskar Murat, Isbel Luke, Hoerner Leslie, Wirbelauer Christiane, Burger Lukas, Hess Daniel, Iesmantavicius Vytautas, Schübeler Dirk
Abstract
Transcription factors (TFs) recognizing short DNA sequences within gene regulatory regions are crucial drivers of cell identity. Despite recent advances, their specificity remains incompletely understood. Here, we address this by contrasting two TFs, NGN2 and MyoD1, which recognize ubiquitous E-box motifs yet instigate distinct cell fates (neurons and muscles, respectively). Following controlled induction in embryonic stem cells, we monitor binding across differentiation trajectories, employing an interpretable machine-learning approach integrating pre-existing DNA accessibility data. This reveals a chromatin-dependent motif syntax, delineating both common and factor-specific binding and predicting genome engagement with high precision. Shared binding sites reside in open chromatin, locally influenced by nucleosomes. In contrast, factor-specific binding in closed chromatin involves NGN2 and MyoD1 acting as pioneer factors, influenced by multi-motifs, rotational spacing, flanking sequences, and specific interaction partners, accounting for subsequent lineage divergence. Extending our methodology to other models demonstrates how such a combination of opportunistic binding and context-specific chromatin opening underpins transcription factor specificity driving differentiation trajectories.
Publisher
Cold Spring Harbor Laboratory