Author:
Rastogi Chaitanya, Rube H. Tomas, Kribelbauer Judith F., Crocker Justin, Loker Ryan E., Martini Gabriella D., Laptenko Oleg, Freed-Pastor William A., Prives Carol, Stern David L., Mann Richard S., Bussemaker Harmen J.
Abstract
Transcription factors (TFs) control gene expression by binding to genomic DNA in a sequence-specific manner. Mutations in TF binding sites are increasingly found to be associated with human disease, yet we currently lack robust methods to predict these sites. Here, we developed a versatile maximum likelihood framework named No Read Left Behind (NRLB) that infers a biophysical model of protein-DNA recognition across the full affinity range from a library of in vitro selected DNA binding sites. NRLB predicts human Max homodimer binding in near-perfect agreement with existing low-throughput measurements. It can capture the specificity of the p53 tetramer and distinguish multiple binding modes within a single sample. Additionally, we confirm that newly identified low-affinity enhancer binding sites are functional in vivo, and that their contribution to gene expression matches their predicted affinity. Our results establish a powerful paradigm for identifying protein binding sites and interpreting gene regulatory sequences in eukaryotic genomes.
Funder
HHS | NIH | National Human Genome Research Institute
HHS | NIH | National Institute of General Medical Sciences
HHS | NIH | National Cancer Institute
HHS | NIH | National Center for Research Resources
Empire State Development's Division of Science, Technology and Innovation
Publisher
Proceedings of the National Academy of Sciences
Cited by
91 articles.