Abstract
AbstractSequence-level data offers insights into biological processes through the interaction of two or more genomic features from the same or different molecular data types. Within motifs, this interaction is often explored via the co-occurrence of feature genomic tracks using fixed-segments or analytical tests that respectively require window size determination and risk of false positives from over-simplified models. Moreover, methods for robustly examining the co-localization of genomic features, and thereby understanding their spatial interaction, have been elusive. We present a new analytical method for examining feature interaction by introducing the notion of reciprocal co-occurrence, define statistics to estimate it, and hypotheses to test for it. Our approach leverages conditional motif co-occurrence events between features to infer their co-localization. Using reverse conditional probabilities and introducing a novel simulation approach that retains motif properties (e.g., length, guanine-content), our method further accounts for potential confounders in testing. As a proof-of-concept, MoCoLo confirmed the co-occurrence of histone markers in a breast cancer cell line. As a novel analysis, MoCoLo identified significant co-localization of oxidative DNA damage within non-B DNA forming regions that significantly differed between non-B DNA structures. Altogether, these findings demonstrate the potential utility of MoCoLo for testing spatial interactions between genomic features via their co-localization.
Publisher
Cold Spring Harbor Laboratory