Abstract
Predicted loss-of-function variants (pLoFs) are often associated with disease. For genes linked with monogenic diseases, we hypothesised that pLoFs present in apparently unaffected individuals may cluster in LoF-tolerant regions. We compared the distribution of pLoFs in ClinVar versus 454,773 individuals in UK Biobank and clustered the variants using Gaussian mixture models. We found that genes in which haploinsufficiency causes developmental disorders with incomplete penetrance were less likely to have a uniform pLoF distribution than other genes (P < 2.2 × 10⁻⁶). In some cases (e.g., ARID1B and GATA6), pLoF variants in the first quarter of the gene could be rescued by an alternative translation start site and should not be reported as pathogenic. In other cases (e.g., ODC1), pathogenic pLoFs were clustered only at the end of the gene, consistent with a gain-of-function disease mechanism. Our results support the use of localised constraint metrics when interpreting variants.
Publisher
Cold Spring Harbor Laboratory