False discovery rate control in genome-wide association studies with population structure

Published:2021-09-27 Issue:40 Volume:118 Page:e2105841118
ISSN:0027-8424
Container-title:Proceedings of the National Academy of Sciences
language:en
Short-container-title:Proc Natl Acad Sci USA

Author:

Sesia Matteo, Bates Stephen, Candès Emmanuel, Marchini Jonathan, Sabatti Chiara

Abstract

We present a comprehensive statistical framework to analyze data from genome-wide association studies of polygenic traits, producing interpretable findings while controlling the false discovery rate. In contrast with standard approaches, our method can leverage sophisticated multivariate algorithms but makes no parametric assumptions about the unknown relation between genotypes and phenotype. Instead, we recognize that genotypes can be considered as a random sample from an appropriate model, encapsulating our knowledge of genetic inheritance and human populations. This allows the generation of imperfect copies (knockoffs) of these variables that serve as ideal negative controls, correcting for linkage disequilibrium and accounting for unknown population structure, which may be due to diverse ancestries or familial relatedness. The validity and effectiveness of our method are demonstrated by extensive simulations and by applications to the UK Biobank data. These analyses confirm our method is powerful relative to state-of-the-art alternatives, while comparisons with other studies validate most of our discoveries. Finally, fast software is made available for researchers to analyze Biobank-scale datasets.
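To make the role of knockoffs as negative controls concrete, here is a minimal sketch of a selection step that controls the false discovery rate. The abstract does not state the selection rule, so this sketch assumes the standard knockoff+ threshold from the general knockoffs literature. The function name `knockoff_plus_select`, the statistics `W`, and the level `q` are illustrative and are not taken from the authors' software. Each `W[j]` is assumed to be a score that is large and positive when variable `j` looks more important than its knockoff, and symmetric around zero when it is null.

```python
import numpy as np


def knockoff_plus_select(W, q=0.1):
    """Return indices selected by the knockoff+ threshold at nominal FDR level q.

    W is a hypothetical vector of knockoff statistics, one per variable:
    positive values favor the original variable, negative values favor
    its knockoff copy.
    """
    W = np.asarray(W, dtype=float)
    # Candidate thresholds: the distinct nonzero magnitudes of W, ascending.
    candidates = np.sort(np.unique(np.abs(W[W != 0])))
    for t in candidates:
        # Negative statistics estimate how many false positives exceed t.
        fdp_hat = (1 + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if fdp_hat <= q:
            # Smallest threshold meeting the target: report everything above it.
            return np.flatnonzero(W >= t)
    # No threshold achieves the target level, so nothing is reported.
    return np.array([], dtype=int)


if __name__ == "__main__":
    W = [5, 4, 3, 2.5, 2, 1.5, -1, 0.5, -0.2, 3.5]
    print(knockoff_plus_select(W, q=0.2))
```

The count of statistics below `-t` stands in for the unknown number of false positives above `t`, which is what lets the knockoffs act as negative controls without a parametric model for the phenotype.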

Funder

National Science Foundation

Simons Foundation

Publisher

Proceedings of the National Academy of Sciences

Subject

Multidisciplinary

Reference: 82 articles.

1. The Future of Genetic Studies of Complex Human Diseases

2. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls

3. The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019

4. C. Sabatti, “Multivariate linear models for GWAS” in Advances in Statistical Bioinformatics, K.-A. Do, Z. S. Qin, M. Vannucci, Eds. (Cambridge University Press, 2013), pp. 188–207.

5. Assessing statistical significance in multivariable genome wide association analysis

Cited by 38 articles.

1. BLESS: bagged logistic regression for biomarker identification;Frontiers in Genetics;2024-09-10

2. Undetected Association Between Fatty Acids and Dementia with Lewy Bodies: A Bidirectional Two-Sample Mendelian Randomization Study;Journal of Alzheimer's Disease;2024-07-30

3. La replicabilidad en la ciencia y el papel transformador de la metodología estadística de knockoffs;SAHUARUS. REVISTA ELECTRÓNICA DE MATEMÁTICAS. ISSN: 2448-5365;2024-06-30

4. Catch me if you can: signal localization with knockoff e-values;Journal of the Royal Statistical Society Series B: Statistical Methodology;2024-06-14

5. Reconciling model-X and doubly robust approaches to conditional independence testing;The Annals of Statistics;2024-06-01