Author:
Wu Yue, Eskin Eleazar, Sankararaman Sriram
Abstract
Imputation has been widely used to aid and interpret the results of Genome-Wide Association Studies (GWAS). Imputation can increase the power to identify associations when the causal variant was not directly observed or typed in the GWAS. There are two broad classes of imputation methods. The first class imputes the genotypes at the untyped variants given the genotypes at the typed variants and then performs a statistical test of association at the imputed variants. The second class, summary statistic imputation, directly imputes the association statistics at the untyped variants given the association statistics observed at the typed variants. The second class is appealing because it tends to be computationally efficient and requires only the summary statistics from a study, whereas the first class requires access to individual-level data that can be difficult to obtain. The statistical properties of these two classes of imputation methods are not fully understood. In this paper, we show that the two classes of imputation methods are equivalent, i.e., have identical asymptotic multivariate normal distributions with zero mean and minor variations in the covariance matrix, under some reasonable assumptions. Using this equivalence, we can understand the effect of imputation methods on power. We show that a commonly employed modification of summary statistic imputation, which we term summary statistic imputation with variance re-weighting, generally leads to a loss in power. In contrast, our proposed method, summary statistic imputation without variance re-weighting, fully accounts for imputation uncertainty while achieving better power.
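As a rough illustration of the second class of methods, the sketch below imputes a z-score at an untyped variant using the standard conditional-Gaussian form common in the summary statistic imputation literature. The abstract does not give formulas, so everything here is an assumption: the function name `impute_z`, the optional `ridge` regularizer, and the toy LD values are hypothetical, and the code is not presented as the authors' implementation. It returns the imputed statistic both without and with variance re-weighting so the two variants named in the abstract can be compared side by side.

```python
import numpy as np


def impute_z(z_typed, ld_tt, ld_ut, ridge=0.0):
    """Impute z-scores at untyped variants from typed-variant z-scores.

    z_typed : (t,) z-scores at typed variants
    ld_tt   : (t, t) LD (correlation) matrix among typed variants
    ld_ut   : (u, t) LD between untyped and typed variants
    ridge   : hypothetical regularizer added to the diagonal of ld_tt

    Returns (z_raw, z_reweighted, var_null).
    """
    z_typed = np.asarray(z_typed, dtype=float)
    ld_tt = np.asarray(ld_tt, dtype=float)
    ld_ut = np.asarray(ld_ut, dtype=float)

    # Optionally regularize the typed-variant LD matrix before solving.
    ld_reg = ld_tt + ridge * np.eye(ld_tt.shape[0])

    # Weights W = ld_ut @ inv(ld_reg), computed with a linear solve.
    weights = np.linalg.solve(ld_reg, ld_ut.T).T

    # Imputed statistic without variance re-weighting.
    z_raw = weights @ z_typed

    # Null variance of each imputed statistic: diag(W ld_tt W^T).
    var_null = np.einsum("ij,jk,ik->i", weights, ld_tt, weights)

    # Variance re-weighting rescales the statistic to unit null variance.
    z_reweighted = z_raw / np.sqrt(var_null)
    return z_raw, z_reweighted, var_null


# Toy example: two typed variants, one untyped variant (made-up LD values).
ld_tt = np.array([[1.0, 0.5], [0.5, 1.0]])
ld_ut = np.array([[0.8, 0.6]])
z_raw, z_rw, var_null = impute_z([3.0, 2.0], ld_tt, ld_ut)
```

With `ridge=0`, `var_null` reduces to the diagonal of `ld_ut @ inv(ld_tt) @ ld_ut.T`, which is at most 1 for a valid correlation matrix, so the re-weighted statistic is never smaller in magnitude than the raw one.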
Publisher
Cold Spring Harbor Laboratory