Inference of Transcriptional Regulation From STARR-seq Data

Author:

Safaeesirat AminORCID,Taeb Hoda,Tekoglu EmirhanORCID,Morova Tunc,Lack Nathan A.,Emberly Eldon

Abstract

AbstractOne of the primary regulatory processes in cells is transcription, during which RNA polymerase II (Pol-II) transcribes DNA into RNA. The binding of Pol-II to its site is regulated through interactions with transcription factors (TFs) that bind to DNA at enhancer cis-regulatory elements. Measuring the enhancer activity of large libraries of distinct DNA sequences is now possible using Massively Parallel Reporter Assays (MPRAs), and computational methods have been developed to identify the dominant statistical patterns of TF binding within these large datasets. Such methods are global in their approach and may overlook important regulatory sites which function only within the local context. Here we introduce a method for inferring functional regulatory sites (their number, location and width) within an enhancer sequence based on measurements of its transcriptional activity from an MPRA method such as STARR-seq. The model is based on a mean-field thermodynamic description of Pol-II binding that includes interactions with bound TFs. Our method applied to simulated STARR-seq data for a variety of enhancer architectures shows how data quality impacts the inference and also how it can find local regulatory sites that may be missed in a global approach. We also apply the method to recently measured STARR-seq data on androgen receptor (AR) bound sequences, a TF that plays an important role in the regulation of prostate cancer. The method identifies key regulatory sites within these sequences which are found to overlap with binding sites of known co-regulators of AR.1Author SummaryWe present an inference method for identifying regulatory sites within a putative DNA enhancer sequence, given only the measured transcriptional output of a set of overlapping sequences using an assay like STARR-seq. It is based on a mean-field thermodynamic model that calculates the binding probability of Pol-II to its promoter and includes interactions with sites in the DNA sequence of interest. By maximizing the likelihood of the data given the model, we can infer the number of regulatory sites, their locations, and their widths. Since it is a local model, it can in principle find regulatory sites that are important within a local context that may get missed in a global fit. We test our method on simulated data of simple enhancer architectures and show that it is able to find only the functional sites. We also apply our method to experimental STARR-seq data from 36 androgen receptor bound DNA sequences from a prostate cancer cell line. The inferred regulatory sites overlap known important regulatory motifs and their ChIP-seq data in these regions. Our method shows potential at identifying locally important functional regulatory sites within an enhancer given only its measured transcriptional output.

Publisher

Cold Spring Harbor Laboratory

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3