Priors, population sizes, and power in genome-wide hypothesis tests-Reference-Cited by-同舟云学术

Priors, population sizes, and power in genome-wide hypothesis tests

Published:2023-04-26 Issue:1 Volume:24 Page:
ISSN:1471-2105
Container-title:BMC Bioinformatics
language:en
Short-container-title:BMC Bioinformatics

Author:

Cai Jitong,Zhan Jianan,Arking Dan E.,Bader Joel S.^ORCID

Abstract

Abstract Background Genome-wide tests, including genome-wide association studies (GWAS) of germ-line genetic variants, driver tests of cancer somatic mutations, and transcriptome-wide association tests of RNAseq data, carry a high multiple testing burden. This burden can be overcome by enrolling larger cohorts or alleviated by using prior biological knowledge to favor some hypotheses over others. Here we compare these two methods in terms of their abilities to boost the power of hypothesis testing. Results We provide a quantitative estimate for progress in cohort sizes and present a theoretical analysis of the power of oracular hard priors: priors that select a subset of hypotheses for testing, with an oracular guarantee that all true positives are within the tested subset. This theory demonstrates that for GWAS, strong priors that limit testing to 100–1000 genes provide less power than typical annual 20–40% increases in cohort sizes. Furthermore, non-oracular priors that exclude even a small fraction of true positives from the tested set can perform worse than not using a prior at all. Conclusion Our results provide a theoretical explanation for the continued dominance of simple, unbiased univariate hypothesis tests for GWAS: if a statistical question can be answered by larger cohort sizes, it should be answered by larger cohort sizes rather than by more complicated biased methods involving priors. We suggest that priors are better suited for non-statistical aspects of biology, such as pathway structure and causality, that are not yet easily captured by standard hypothesis tests.

Funder

NIH/NCI

NIH/NHBLI

Jayne Koskinas Ted Giovanis Foundation for Health and Policy

Publisher

Springer Science and Business Media LLC

Subject

Applied Mathematics,Computer Science Applications,Molecular Biology,Biochemistry,Structural Biology

Link

https://link.springer.com/content/pdf/10.1186/s12859-023-05261-9.pdf

Reference19 articles.

1. Risch N, Merikangas K. The future of genetic studies of complex human diseases. Science. 1996;273(5281):1516–7.

2. Moore GE. Cramming more components onto integrated circuits. Electronics. 1965;38(8):114–7.

3. Carlson R. The pace and proliferation of biological technologies. Biosecur Bioterror. 2003;1(3):203–14.

4. Hindorff LA, Sethupathy P, Junkins HA, Ramos EM, Mehta JP, Collins FS, Manolio TA. Potential etiologic and functional implications of genome-wide association loci for human diseases. Proc Natl Acad Sci. 2009;106(23):9352–67.

5. Schork AJ, Thompson WK, Pham P, et al. All SNPs are not created equal: Genome-wide association studies reveal a consistent pattern of enrichment among functionally annotated SNPs. PLoS Genet. 2013;9(4):1003449.

Cited by 1 articles. 订阅此论文施引文献订阅此论文施引文献，注册后可以免费订阅5篇论文的施引文献，订阅后可以查看论文全部施引文献

1. Cabergoline as a Novel Strategy for Post-Pregnancy Breast Cancer Prevention in Mice and Human;2024-02-05