Abstract
In genetic association analysis of complex traits, detection of interaction (either GxG or GxE) can help to elucidate the genetic architecture and biological mechanisms underlying the trait. Detection of interaction in a genome-wide association study (GWAS) can be methodologically challenging for various reasons, including a high burden of multiple comparisons when testing for epistasis between all possible pairs of a set of genome-wide variants, as well as heteroscedasticity effects occurring in the presence of GxG or GxE interaction. In this paper, we address an even more striking phenomenon, which we call the “feast or famine” effect, that occurs when testing for interaction in a genome-wide context. As we verify, even in a simplified setting in which there is no interaction at all (and so no heteroscedasticity), in a GWAS to detect GxG or GxE interaction with a fixed genetic variant or environmental factor, the distribution of the genome-wide p-values under the null hypothesis is not the i.i.d. uniform one that is commonly assumed. Using standard methods, even if all SNPs are independent, some GWASs will have systematically underinflated p-values (“feast”), and others will have systematically overinflated p-values (“famine”), which can lead to false detection of interaction, reduced power, inconsistent results across studies, and failure to replicate true signal. This startling phenomenon is specific to detection of interaction in a GWAS, and it may partly explain why such detection has so far proved challenging and difficult to replicate. We show theoretically that the key cause of this phenomenon is the choice of variables that are conditioned on in the analysis, which suggests correcting the problem by changing the way the conditioning is done. Using this insight, we have developed the TINGA method to adjust the interaction test statistics to make their p-values closer to uniform under the null hypothesis.
In simulations, we show that TINGA both controls type 1 error and improves power. TINGA allows for covariates and population structure through use of a linear mixed model and accounts for heteroscedasticity. We apply TINGA to detection of epistasis in a study of flowering time in Arabidopsis thaliana.
Author summary
Testing for interactions in GWAS can lead to insight into biological mechanisms, but poses greater challenges than ordinary genetic association GWAS. When testing for interaction in a GWAS setting with one fixed SNP or environmental variable, the standard test statistics may not have the expected statistical properties under the null hypothesis, which can lead to false detection of interaction, inconsistent results across studies, reduced power, and failure to replicate true signal. We propose the TINGA method to adjust the test statistics so that the null distribution of their p-values is closer to uniform. Through simulations and real data analysis, we illustrate the problems with the standard analysis and the improvement of our proposed method.
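The “feast or famine” effect described above can be explored with a small toy simulation. The sketch below is not the authors' code and does not implement TINGA; it only illustrates the standard analysis under the null. All names (`interaction_chi2`, `simulate_null_gwas`) and all parameter values (sample size, number of SNPs, allele frequency, main-effect size, number of replicates) are assumptions chosen for illustration. Each replicate GWAS draws one environmental variable and one phenotype with no interaction, tests every independent SNP for GxE interaction by ordinary least squares, and records the mean interaction chi-square statistic. If the null p-values were i.i.d. uniform, these means would vary across replicates by roughly sqrt(2/m) for m SNPs; a visibly larger spread would be consistent with some GWASs being systematically inflated and others deflated.

```python
import numpy as np

def interaction_chi2(y, e, G):
    """Squared OLS t-statistics for the G x E term, one per SNP (column of G).

    Fits y ~ 1 + g + e + g*e separately for every SNP g, using batched
    linear algebra instead of an explicit loop.
    """
    n, m = G.shape
    Gt = G.T.astype(float)
    # Stack of design matrices, shape (m, n, 4): intercept, SNP, E, SNP*E
    X = np.stack([np.ones((m, n)), Gt, np.broadcast_to(e, (m, n)), Gt * e], axis=2)
    XtX_inv = np.linalg.inv(np.einsum("mni,mnj->mij", X, X))
    Xty = np.einsum("mni,n->mi", X, y)
    beta = np.einsum("mij,mj->mi", XtX_inv, Xty)
    resid = y - np.einsum("mni,mi->mn", X, beta)
    s2 = (resid ** 2).sum(axis=1) / (n - 4)      # residual variance per SNP
    se = np.sqrt(s2 * XtX_inv[:, 3, 3])          # standard error of the G x E term
    return (beta[:, 3] / se) ** 2

def simulate_null_gwas(n=200, m=2000, maf=0.3, rng=None):
    """One null GWAS: E and Y are fixed, SNPs are independent, no interaction."""
    rng = np.random.default_rng() if rng is None else rng
    e = rng.standard_normal(n)
    y = 0.5 * e + rng.standard_normal(n)         # main effect of E only (assumed size)
    G = rng.binomial(2, maf, size=(n, m))        # independent additive genotypes
    return interaction_chi2(y, e, G)

rng = np.random.default_rng(0)
n_snps = 2000
mean_chi2 = np.array(
    [simulate_null_gwas(m=n_snps, rng=rng).mean() for _ in range(30)]
)
iid_sd = np.sqrt(2 / n_snps)  # spread expected if the null p-values were i.i.d. uniform

print("mean chi-square per replicate GWAS:", np.round(mean_chi2, 2))
print("observed SD across replicates:", mean_chi2.std(ddof=1))
print("SD expected under i.i.d. uniform p-values:", iid_sd)
```

Comparing the two printed standard deviations gives a rough sense of how much the per-study inflation varies beyond ordinary sampling noise. Averaged over many replicate studies the test can still look well calibrated, which is why the effect concerns what is conditioned on within a single GWAS.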
Publisher
Cold Spring Harbor Laboratory