Abstract
Genome-wide protein interaction assays aspire to map the complete binding pattern of gene regulators. However, common practice relies on replication and high-stringency statistics that favor false negatives over false positives, thereby excluding portions of signal that may represent biologically relevant events. Here, we present ICEBERG (Increased Capture of Enrichment By Exhaustive Replicate aGgregation), an experimental and analytical pipeline that harnesses large numbers of CUT&RUN replicates to discover the full set of binding events and chart the line between false positives and false negatives. We employed ICEBERG to map the full set of H3K4me3-marked regulatory regions and β-catenin targets in human colorectal cancer cells. The ICEBERG datasets allow benchmarking of individual replicates and comparison of the performance of peak calling and replication approaches, and they expose the arbitrary nature of other strategies for identifying reproducible peaks. Instead of a static view of genomic targets, ICEBERG established a spectrum of detection probabilities across the genome for a given factor, underscoring the intrinsic dynamicity of its mechanism of action and making it possible to distinguish frequent from rare regulation events. Finally, ICEBERG discovered instances, undetectable with other approaches, that might underlie novel mechanisms of colorectal cancer progression.
Publisher
Cold Spring Harbor Laboratory