Abstract
We develop TwinEQTL, a computationally efficient alternative to the linear mixed-effects model (LMM) for twin genome-wide association study (GWAS) data. Instead of analyzing all twin samples together with an LMM, TwinEQTL first splits the twin samples into two groups, each made up of independent samples, so that multiple linear regression can be validly performed on each group separately. It then combines the two non-independent test results with a meta-analysis-like approach. Through mathematical derivations, we prove the validity of the TwinEQTL algorithm and show that the correlation between the two dependent test statistics at each single-nucleotide polymorphism (SNP) is independent of its minor allele frequency (MAF). Thus the correlation is constant across all SNPs. Through simulations, we show empirically that TwinEQTL has well-controlled type I error with negligible power loss compared to the gold-standard LMM. To accommodate eQTL analysis with twin subjects, we further implement TwinEQTL as an R package with much improved computational efficiency. Our approach provides a substantial gain in computing speed for GWAS and eQTL analysis with twin samples.
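The split-then-combine idea can be illustrated with a minimal sketch. It is not the TwinEQTL implementation: the function names are hypothetical, and the equal-weight combination rule is a standard textbook way to merge two z-statistics with a known correlation, assumed here for illustration only. The correlation `r` is treated as a single known value shared by all SNPs, in line with the constant-correlation result stated above.

```python
import math


def split_twin_pairs(pairs):
    """Split twin pairs into two groups with one twin from each pair per group.

    Hypothetical helper: `pairs` is a list of (twin_a, twin_b) sample IDs.
    Within each returned group, no two samples come from the same pair.
    """
    group1 = [a for a, _ in pairs]
    group2 = [b for _, b in pairs]
    return group1, group2


def combine_correlated_z(z1, z2, r):
    """Combine two standard-normal test statistics with correlation r.

    Assumed rule (equal weights): Var(z1 + z2) = 2 + 2r, so dividing the sum
    by sqrt(2 + 2r) gives a statistic that is standard normal under the null.
    """
    if not -1.0 < r <= 1.0:
        raise ValueError("r must satisfy -1 < r <= 1")
    return (z1 + z2) / math.sqrt(2.0 + 2.0 * r)


def two_sided_p(z):
    """Two-sided p-value for a standard-normal statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    g1, g2 = split_twin_pairs([("t1a", "t1b"), ("t2a", "t2b")])
    print(g1, g2)
    # z1 and z2 would come from separate regressions on g1 and g2.
    z = combine_correlated_z(2.1, 1.8, r=0.3)
    print(z, two_sided_p(z))
```

Because `r` does not vary by SNP under this assumption, it only needs to be obtained once and can be reused for every test, which is where the computational saving over fitting an LMM per SNP would come from.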
Publisher
Cold Spring Harbor Laboratory