Affiliation:
1. School of Computer Science, Qufu Normal University, Rizhao, China
2. School of Information and Control Engineering, Qingdao University of Technology, Qingdao, China
3. School of Health and Life Sciences, University of Health and Rehabilitation Sciences, Qingdao, China
Abstract
Epistasis is a ubiquitous phenomenon in genetics and is considered one of the main factors in current efforts to uncover the missing heritability of complex diseases. Simulated data are crucial for evaluating epistasis detection tools in genome‐wide association studies (GWAS). Existing simulators typically suffer from two limitations: they lack support for high‐order epistasis models containing multiple single nucleotide polymorphisms (SNPs), and they cannot generate simulated SNP data independently. In this study, we propose a simulator, SimHOEPI, which calculates penetrance tables of high‐order epistasis models based on either prevalence or heritability and uses a resampling strategy to generate simulation data independently. The highlights of SimHOEPI are the preservation of realistic minor allele frequencies in the sampled data, the accurate calculation and embedding of high‐order epistasis models, and an acceptable simulation time. A series of experiments was carried out to verify these properties from different aspects. The experimental results show that SimHOEPI can independently generate simulated SNP data containing high‐order epistasis models, suggesting that it may serve as an alternative simulator for GWAS.
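To make the link between a penetrance table, prevalence, and heritability concrete, the following is a minimal sketch and not SimHOEPI's actual implementation. It assumes Hardy-Weinberg equilibrium and independent SNPs, and uses the commonly used definitions: prevalence is the genotype-frequency-weighted mean penetrance, and heritability is the variance of penetrance divided by P(D)(1 - P(D)). The function names, the minor allele frequencies, and the three-SNP threshold model are all hypothetical and chosen only for illustration.

```python
from itertools import product


def genotype_freqs(maf):
    """Hardy-Weinberg frequencies for genotypes carrying 0, 1, 2 minor alleles."""
    return [(1 - maf) ** 2, 2 * maf * (1 - maf), maf ** 2]


def prevalence_and_heritability(penetrance, mafs):
    """Compute prevalence and heritability of a k-SNP penetrance table.

    penetrance maps a genotype tuple (one code 0/1/2 per SNP) to
    P(disease | genotype). SNPs are assumed independent (an assumption
    of this sketch, not a statement about SimHOEPI).
    """
    per_snp = [genotype_freqs(m) for m in mafs]
    combos = list(product(range(3), repeat=len(mafs)))

    # Joint genotype frequency is the product of per-SNP frequencies.
    freq = {}
    for g in combos:
        f = 1.0
        for snp, code in enumerate(g):
            f *= per_snp[snp][code]
        freq[g] = f

    # Prevalence: frequency-weighted mean penetrance.
    prevalence = sum(freq[g] * penetrance[g] for g in combos)
    # Heritability: variance of penetrance over the variance of disease status.
    variance = sum(freq[g] * (penetrance[g] - prevalence) ** 2 for g in combos)
    heritability = variance / (prevalence * (1 - prevalence))
    return prevalence, heritability


# Hypothetical third-order model: elevated risk when the three SNPs
# together carry at least four minor alleles.
mafs = [0.2, 0.3, 0.4]
penetrance = {
    g: (0.5 if sum(g) >= 4 else 0.05)
    for g in product(range(3), repeat=len(mafs))
}
prevalence, heritability = prevalence_and_heritability(penetrance, mafs)
print(f"prevalence={prevalence:.4f}, heritability={heritability:.4f}")
```

A simulator that fits a model to a target prevalence or heritability would work in the opposite direction, solving for penetrance values that yield the requested quantity; the sketch only shows the forward calculation.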