A novel application of data‐consistent inversion to overcome spurious inference in genome‐wide association studies

Published:2024-04-21 Issue:6 Volume:48 Page:270-288
ISSN:0741-0395
Container-title:Genetic Epidemiology
language:en

Author:

Janani Negar¹, Young Kendra A.², Kinney Greg², Strand Matthew³, Hokanson John E.², Liu Yaning¹, Butler Troy¹, Austin Erin¹

Affiliation:

1. Department of Mathematical and Statistical Sciences, University of Colorado Denver, Denver, Colorado, USA

2. Department of Epidemiology, Colorado School of Public Health, Aurora, Colorado, USA

3. Division of Biostatistics, National Jewish Health, Denver, Colorado, USA

Abstract

Genome‐wide association studies (GWAS) typically use linear or logistic regression models to identify associations between phenotypes (traits) and genotypes (genetic variants) of interest. However, regression under the additive assumption has potential limitations. First, the assumption that residuals are normally distributed rarely holds in practice, and deviation from normality increases the Type‐I error rate. Second, building a model on the additive assumption ignores genetic structures such as dominant, recessive, and protective‐risk cases. Ignoring genetic variants may result in spurious conclusions about the association between a variant and a trait. We propose an assumption‐free model built upon data‐consistent inversion (DCI), a recently developed measure‐theoretic framework for uncertainty quantification. The proposed DCI‐derived model builds a nonparametric distribution on model inputs that propagates to the distribution of observed data without requiring the normality assumption on residuals that the regression model makes. This characteristic enables the proposed DCI‐derived model to cover all genetic variants without relying on the additivity of the classic‐GWAS model. Simulations and a replication GWAS with data from COPDGene demonstrate that this model controls the Type‐I error rate at least as well as the classic‐GWAS (additive linear model) approach while having similar or greater power to discover variants under different genetic modes of transmission.
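
As a minimal illustration of the genetic modes named in the abstract, the sketch below encodes a minor-allele count (0, 1, or 2) as a regression covariate under additive, dominant, and recessive coding. The function name `encode_genotype` and the sample values are hypothetical and do not come from the paper. The protective‐risk case is left out because the abstract does not define its coding.

```python
def encode_genotype(allele_count, mode):
    """Map a minor-allele count (0, 1, or 2) to a covariate value.

    Hypothetical helper for illustration only; not the paper's code.
    """
    if allele_count not in (0, 1, 2):
        raise ValueError("allele_count must be 0, 1, or 2")
    if mode == "additive":
        # Each copy of the allele contributes equally: 0, 1, 2.
        return allele_count
    if mode == "dominant":
        # A single copy is enough to show the effect: 0, 1, 1.
        return 1 if allele_count >= 1 else 0
    if mode == "recessive":
        # Both copies are needed to show the effect: 0, 0, 1.
        return 1 if allele_count == 2 else 0
    raise ValueError(f"unknown mode: {mode}")


# Illustrative allele counts for three individuals (made-up values).
counts = [0, 1, 2]
codings = {
    mode: [encode_genotype(c, mode) for c in counts]
    for mode in ("additive", "dominant", "recessive")
}
```

Under additive coding, each extra copy of the allele shifts the covariate by the same amount. The dominant and recessive codings break that equal spacing, which is the kind of structure an additive-only model cannot represent.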

Publisher

Wiley

Link

https://onlinelibrary.wiley.com/doi/pdf/10.1002/gepi.22563

Reference: 34 articles.

1. A Rapid Gene-Based Genome-Wide Association Test with Multivariate Traits

2. To stratify or not to stratify: power considerations for population-based genome-wide association studies of quantitative traits

3. Chromatin Landscapes of Human Lung Cells Predict Potentially Functional Chronic Obstructive Pulmonary Disease Genome-Wide Association Study Variants

4. Chapter 11: Genome-Wide Association Studies

5. Data-consistent inversion for stochastic input-to-output maps