Affiliation:
1. School of Statistics and Data Science, Nankai University, Tianjin, China
2. Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, Maryland, USA
Abstract
Mendelian randomization (MR) is a statistical method that utilizes genetic variants as instrumental variables (IVs) to investigate causal relationships between risk factors and outcomes. Although MR has gained popularity in recent years due to its ability to analyze summary statistics from genome‐wide association studies (GWAS), it requires a substantial number of single nucleotide polymorphisms (SNPs) as IVs to ensure sufficient power for detecting causal effects. Unfortunately, the complex genetic heritability of many traits can lead to the use of invalid IVs that affect both the risk factor and the outcome directly or through an unobserved confounder. This can result in biased and imprecise estimates, as reflected by a larger mean squared error (MSE). In this study, we focus on the widely used two‐stage least squares (2SLS) method and derive formulas for its bias and MSE when estimating causal effects using invalid IVs. Using these formulas, we identify conditions under which the 2SLS estimate is unbiased and reveal how independent or correlated pleiotropic effects influence the accuracy and precision of the 2SLS estimate. We validate these formulas through extensive simulation studies and demonstrate their application in an MR study evaluating the causal effect of the waist‐to‐hip ratio on various sleeping patterns. Our results can aid in designing future MR studies and serve as benchmarks for assessing more sophisticated MR methods.
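The effect of invalid IVs on 2SLS can be illustrated with a small one-sample simulation. The sketch below is not the study's simulation design or its derived formulas: the function name `simulate_2sls`, the sample size, the effect-size ranges, and the data-generating model (direct SNP-to-outcome effects only, no confounder-mediated pleiotropy) are all assumptions chosen for illustration. It compares the 2SLS estimate with the standard large-sample approximation of the bias under this simple model, `gamma' S alpha / gamma' S gamma`, where `gamma` holds the SNP-exposure effects, `alpha` the direct SNP-outcome effects, and `S` the genotype covariance matrix.

```python
import numpy as np

def simulate_2sls(n=20000, n_snps=20, beta=0.3, pleio_mean=0.0,
                  pleio_sd=0.0, seed=0):
    """Return (2SLS estimate, approximate large-sample bias).

    Hypothetical setup: alpha holds direct SNP -> outcome (pleiotropic)
    effects; alpha = 0 for every SNP means all IVs are valid.
    """
    rng = np.random.default_rng(seed)
    maf = rng.uniform(0.1, 0.5, size=n_snps)             # minor allele frequencies
    G = rng.binomial(2, maf, size=(n, n_snps)).astype(float)
    gamma = rng.uniform(0.1, 0.3, size=n_snps)           # SNP -> exposure effects
    alpha = rng.normal(pleio_mean, pleio_sd, size=n_snps)
    U = rng.normal(size=n)                               # unobserved confounder
    X = G @ gamma + U + rng.normal(size=n)               # exposure
    Y = beta * X + G @ alpha + U + rng.normal(size=n)    # outcome

    # Center everything so no intercept is needed.
    Gc, Xc, Yc = G - G.mean(axis=0), X - X.mean(), Y - Y.mean()

    # Stage 1: regress the exposure on all SNPs and keep the fitted values.
    coef, *_ = np.linalg.lstsq(Gc, Xc, rcond=None)
    X_hat = Gc @ coef

    # Stage 2: regress the outcome on the fitted exposure.
    estimate = float(X_hat @ Yc / (X_hat @ X_hat))

    # Large-sample approximation of the bias under this simple model.
    S = np.cov(G, rowvar=False)
    approx_bias = float(gamma @ S @ alpha / (gamma @ S @ gamma))
    return estimate, approx_bias

if __name__ == "__main__":
    for label, mean in [("valid IVs", 0.0), ("directional pleiotropy", 0.1)]:
        est, bias = simulate_2sls(pleio_mean=mean)
        print(f"{label}: estimate={est:.3f}, approx. bias={bias:.3f}")
```

In this setup the estimate should sit near the true effect (`beta = 0.3`) when every `alpha` is zero, and shift upward by roughly the approximate bias when the pleiotropic effects share a common positive mean. Setting `pleio_sd > 0` with `pleio_mean = 0` gives a rough view of balanced pleiotropy, which mainly adds variability rather than systematic bias.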
Subject
Genetics (clinical), Epidemiology