Author:
Katki Hormuzd A., Berndt Sonja I., Machiela Mitchell J., Stewart Douglas R., Garcia-Closas Montserrat, Kim Jung, Shi Jianxin, Yu Kai, Rothman Nathaniel
Abstract
Background
The rule of thumb that there is little gain in statistical power from obtaining more than 4 controls per case is based on a type-1 error rate of α = 0.05. However, association studies that evaluate thousands or millions of associations use a smaller α and may have access to plentiful controls. We investigate the power gains, and the reductions in p-values, from increasing well beyond 4 controls per case at small α.
Methods
We calculate the power, the median expected p-value, and the minimum detectable odds-ratio (OR), as a function of the number of controls/case, as α decreases.
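As a rough illustration of how power can be computed as a function of the controls/case ratio, the sketch below uses a two-sided normal (Wald-type) approximation. It is not the authors' method: the function name `power_at_ratio` is hypothetical, and the code assumes that, with the number of cases held fixed, the standard error of the log-OR scales as sqrt(1 + 1/k) for k controls per case.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution


def power_at_ratio(power_at_one: float, alpha: float, k: float) -> float:
    """Approximate power with k controls per case, given the power at 1 control/case.

    Assumptions (illustrative, not taken from the paper):
    - two-sided test with a normal approximation, ignoring the negligible opposite tail;
    - number of cases fixed, so SE(log-OR) is proportional to sqrt(1 + 1/k).
    """
    z_alpha = -_STD_NORMAL.inv_cdf(alpha / 2)  # two-sided critical value
    # Noncentrality (effect / SE) implied by the stated power at k = 1.
    ncp_one = z_alpha + _STD_NORMAL.inv_cdf(power_at_one)
    # Moving from 1 to k controls per case shrinks the SE by sqrt((1 + 1/k) / 2).
    ncp_k = ncp_one * sqrt(2.0 / (1.0 + 1.0 / k))
    return _STD_NORMAL.cdf(ncp_k - z_alpha)


if __name__ == "__main__":
    for k in (1, 4, 10, 50):
        print(k, round(power_at_ratio(0.2, 5e-8, k), 2))
```

Under these assumptions, starting from power = 0.2 at α = 5 × 10⁻⁸ with 1 control/case, the approximation gives roughly 0.65, 0.78, and 0.84 at 4, 10, and 50 controls/case, in line with the figures reported in the Results.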
Results
As α decreases, at each ratio of controls per case, the increase in power is larger than for α = 0.05. For α between 10⁻⁶ and 10⁻⁹ (typical for thousands or millions of associations), increasing from 4 controls per case to 10–50 controls per case increases power. For example, a study with power = 0.2 (α = 5 × 10⁻⁸) with 1 control/case has power = 0.65 with 4 controls/case, but with 10 controls/case has power = 0.78, and with 50 controls/case has power = 0.84. For situations where obtaining more than 4 controls per case provides only small increases in power beyond 0.9 (at small α), the expected p-value can still decrease by orders of magnitude below α. Increasing from 1 to 4 controls/case reduces the minimum detectable OR toward the null by 20.9%, and increasing from 4 to 50 controls/case reduces it by an additional 9.7%, a result that applies regardless of α and hence also applies to “regular” α = 0.05 epidemiology.
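The 20.9% and 9.7% figures can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes the minimum detectable effect is measured on the log-OR scale and is proportional to sqrt(1 + 1/k) for k controls per case with the number of cases fixed, and the function name `relative_mdor_reduction` is hypothetical. Under that assumption the α-dependent and power-dependent terms cancel in the ratio, which is consistent with the statement that the result holds regardless of α.

```python
from math import sqrt


def relative_mdor_reduction(k_from: float, k_to: float) -> float:
    """Fractional shrinkage of the minimum detectable log-OR toward the null.

    Assumption (illustrative): with cases fixed, the minimum detectable
    log-OR is proportional to sqrt(1 + 1/k) for k controls per case.
    """
    scale_from = sqrt(1.0 + 1.0 / k_from)
    scale_to = sqrt(1.0 + 1.0 / k_to)
    return 1.0 - scale_to / scale_from


if __name__ == "__main__":
    # 1 -> 4 controls per case: about 20.9%
    print(f"{relative_mdor_reduction(1, 4):.1%}")
    # 4 -> 50 controls per case, relative to the value at 4: about 9.7%
    print(f"{relative_mdor_reduction(4, 50):.1%}")
```

Note that the second reduction is computed relative to the minimum detectable effect at 4 controls/case, not relative to the value at 1 control/case.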
Conclusions
At small α, compared with 4 controls/case, recruiting 10 or more controls/case can increase power, reduce the expected p-value by 1–2 orders of magnitude, and meaningfully reduce the minimum detectable OR. These benefits of a higher controls/case ratio grow as the number of cases increases, although the size of the benefit depends on exposure frequencies and the true OR. Provided that controls are comparable to cases, our findings support greater sharing of comparable controls in large-scale association studies.
Funder
National Institutes of Health
Publisher
Springer Science and Business Media LLC
Subject
Health Informatics,Epidemiology
Cited by
8 articles.