Abstract
Statistical epistasis has been studied extensively because of its potential to provide evidence of genetic interactions underlying phenotypes, but methodological limitations have prevented its exhaustive, widespread application. We present new algorithms for computing the interaction coefficients of standard regression models for epistasis; these algorithms permit many varied encodings of the interaction terms between loci and use memory efficiently. The algorithms are given for two-way and three-way epistasis and may be generalized to higher-order epistasis. Statistical tests for the interaction coefficients are also provided. We also present an efficient matrix-based algorithm for permutation testing for two-way epistasis. We offer a proof and experimental evidence that methods that look for epistasis only at loci that have main effects may not be justified. Given the computational efficiency of the algorithm, we applied the method to a rat data set and a mouse data set, each with at least 10,000 loci and 1,000 samples, using the standard Cartesian encoding and the XOR penetrance function for the interactions, to test for evidence of statistical epistasis for the phenotype of body mass index. In both data sets, the XOR penetrance function found greater evidence for statistical epistasis in many more pairs of loci than the Cartesian encoding did. In the rat data set, the pairs of loci found using the XOR penetrance function are enriched for biologically relevant pathways.
Author summary
Epistasis, the interaction between two or more genes, is likely integral to the study of genetics and present throughout nature. Yet it is seldom fully explored, as most approaches (such as GWAS) focus primarily on single-locus effects, partly because analyzing all pairwise and higher-order interactions requires significant computational resources. Many current methods for epistasis detection consider only a Cartesian encoding for interaction terms. This is likely limiting, as epistatic interactions can evolve to produce varied relationships between genes, some of them non-linear. In this work we describe computationally efficient algorithms for the detection of statistical epistasis that allow varied interaction encodings for modeling epistasis. Our methodology efficiently detects pairwise and three-way epistatic interactions in two closely related species (rat and mouse) under both Cartesian and XOR interaction encodings. Our results in both species show that many biologically relevant epistatic relationships would have gone undetected if only one interaction encoding had been applied, providing evidence that more varied models of interaction may be needed to describe the epistasis that occurs in living systems.
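To make the distinction between the two interaction encodings concrete, the following is a minimal sketch, not the authors' algorithm. It fits an ordinary least-squares model for a single pair of loci, phenotype = b0 + b1*g1 + b2*g2 + b3*f(g1, g2), and reports the interaction coefficient b3 with its t statistic. Several things here are assumptions made for illustration: genotypes are coded additively as 0/1/2, the Cartesian encoding is taken to be the product g1*g2, the XOR encoding is taken to be (g1 mod 2) XOR (g2 mod 2), and the function names are hypothetical. The paper's own definitions and its efficient exhaustive and permutation algorithms may differ.

```python
import numpy as np

def cartesian_encoding(g1, g2):
    # Assumed form of the Cartesian encoding: product of additive genotype codes.
    return g1 * g2

def xor_encoding(g1, g2):
    # Assumed form of the XOR encoding: 1 when exactly one locus is heterozygous.
    return (g1 % 2) ^ (g2 % 2)

def interaction_fit(y, g1, g2, encode):
    """Fit y ~ 1 + g1 + g2 + encode(g1, g2) by least squares.

    Returns the interaction coefficient and its t statistic.
    """
    X = np.column_stack(
        [np.ones(len(y)), g1, g2, encode(g1, g2)]
    ).astype(float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]          # residual degrees of freedom
    sigma2 = resid @ resid / dof       # residual variance estimate
    se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[3, 3])
    return beta[3], beta[3] / se

# Simulated example (hypothetical data): the phenotype depends on the XOR term.
rng = np.random.default_rng(0)
n = 1000
g1 = rng.binomial(2, 0.5, size=n)
g2 = rng.binomial(2, 0.5, size=n)
y = 1.0 * xor_encoding(g1, g2) + rng.normal(size=n)

for name, enc in [("Cartesian", cartesian_encoding), ("XOR", xor_encoding)]:
    coef, t = interaction_fit(y, g1, g2, enc)
    print(f"{name}: coefficient={coef:.3f}, t={t:.2f}")
```

In this simulation the XOR term should carry a much larger t statistic than the Cartesian term, which illustrates how an interaction can be missed when only the product encoding is tested. A scan over all pairs would repeat this fit for every pair of loci, which is the cost that efficient algorithms aim to reduce.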
Publisher
Cold Spring Harbor Laboratory