Author:
Abegaz Fentaw, Van Lishout François, Mahachie John Jestinah M, Chiachoompu Kridsadakorn, Bhardwaj Archana, Gusareva Elena S., Wei Zhi, Hakonarson Hakon, Van Steen Kristel
Abstract
In genome-wide association studies, the extent and impact of confounding due to population structure are well recognized. Inadequate handling of such confounding is likely to lead to spurious associations, hampering replication and the identification of causal variants. Several strategies have been developed to protect associations against confounding; the most popular is based on Principal Component Analysis. In contrast, the extent and impact of confounding due to population structure in gene-gene interaction (epistasis) association studies are much less investigated and understood. In particular, the role of non-linear genetic population substructure in epistasis detection is largely under-investigated, especially outside a regression framework. To identify causal variants acting in synergy, and to improve the interpretability and replicability of epistasis results, we introduce three strategies based on the model-based multifactor dimensionality reduction (MB-MDR) approach for structured populations. We demonstrate through extensive simulation studies the effect of various degrees of genetic population structure and relatedness on epistasis detection, and propose appropriate remedial measures based on linear and non-linear sample genetic similarity.
Author Summary
One of the biggest challenges in human genetics is to understand the genetic basis of complex diseases such as cancer, diabetes, heart disease, depression, asthma, inflammatory bowel disease and hypertension, for instance by identifying genes, gene-gene interactions and gene-environment interactions in association studies. Over the years, gene-gene interaction (epistasis) detection has taken on a more prominent role, in view of precision medicine and the hunt for novel drug targets and biomarkers. However, consortium-based epistasis studies, which are growing in number and are marked by heterogeneous sample collections due to population structure or shared genetic ancestry, are likely to be prone to spurious associations and to low power for detecting associated or causal genes. In this work we introduced several strategies for epistasis studies that correct for confounding due to population structure. Based on extensive simulation studies, we demonstrated the effect of genetic population structure on epistasis detection and investigated remedial measures against confounding, based on linear and non-linear sample genetic similarity.
Publisher
Cold Spring Harbor Laboratory
Cited by
1 article.