Abstract
It is widely recognized that the missing heritability of many human diseases is partially due to noncoding genetic variants, but multiple challenges hinder the identification of functional disease-associated noncoding variants. Noncoding variants can outnumber coding variants many times over; many of them are not functional but are in linkage disequilibrium with the functional ones; different variants can have epistatic effects; different variants can affect the same genes or pathways in different individuals; and some variants are related to each other not by affecting the same gene but by affecting the binding of the same upstream regulator. To overcome these difficulties, we propose a novel analysis framework that considers the convergent impacts of different genetic variants on protein binding, which provides multi-granular information about disease-associated perturbations of regulatory elements, genes, and pathways. Applying it to our whole-genome sequencing data of 918 short-segment Hirschsprung disease patients and matched controls, we identify various novel genes not detected by standard single-variant and region-based tests, functionally centering on neural crest migration and development. Our framework also identifies upstream regulators whose binding is influenced by the noncoding variants. Using human neural crest cells, we confirm cell-stage-specific regulatory roles of three top novel regulatory elements on our list, in the RET, RASGEF1A, and PIK3C2B loci, respectively. In the PIK3C2B regulatory element, we further show that a noncoding variant found only in the patients affects the binding of the gliogenesis regulator NFIA, with a corresponding down-regulation of multiple genes in the same topologically associating domain.
Publisher
Cold Spring Harbor Laboratory