A map of cis-regulatory modules and constituent transcription factor binding sites in 80% of the mouse genome

Published: 2022-05-30

Author:

Ni Pengyu, Wilson David, Su Zhengchang

Abstract

The mouse is probably the most important model organism for studying mammalian biology and human diseases. A better understanding of the mouse genome will help in understanding the human genome, biology and diseases. However, despite recent progress, characterization of the regulatory sequences in the mouse genome is still far from complete, limiting its use for understanding the regulatory sequences in the human genome. Here, by using an efficient pipeline to integrate binding peaks in 9,060 transcription factor (TF) ChIP-seq datasets that cover 79.9% of the mappable mouse genome, we were able to partition these binding peak-covered genome regions into a cis-regulatory module (CRM) candidate (CRMC) set and a non-CRMC set. The CRMCs contain 912,197 putative CRMs and 38,554,729 TF binding site (TFBS) islands, covering 55.5% and 24.4% of the mappable genome, respectively. The CRMCs tend to be under strong evolutionary constraints, indicating that they are likely cis-regulatory, while the non-CRMCs are largely selectively neutral, indicating that they are unlikely to be cis-regulatory. Based on evolutionary profiles of the genome positions, we further estimated that 63.8% and 27.4% of the mouse genome might code for CRMs and TFBSs, respectively. Validation using experimental data suggests that at least most of the CRMCs are authentic. Thus, this unprecedentedly comprehensive map of CRMs and TFBSs can be a good resource to guide experimental studies of regulatory genomes in mice and humans.

Publisher

Cold Spring Harbor Laboratory
