Author:
Mathur Maya B., VanderWeele Tyler J.
Abstract
When investigators test multiple outcomes or fit different model specifications to the same dataset, as in multiverse analyses, the resulting test statistics may be correlated. We propose new multiple-testing metrics that compare the observed number of hypothesis test rejections (θ̂) at an unpenalized α-level to the distribution of rejections that would be expected if all tested null hypotheses held (the “global null”). Specifically, we propose reporting a “null interval” for the number of α-level rejections expected to occur in 95% of samples under the global null, the difference between θ̂ and the upper limit of the null interval (the “excess hits”), and a one-sided joint test based on θ̂ of the global null. For estimation, we describe resampling algorithms that asymptotically recover the sampling distribution under the global null. These methods accommodate arbitrarily correlated test statistics and do not require high-dimensional analyses, though they also accommodate such analyses. In a simulation study, we assess properties of the proposed metrics under varying correlation structures as well as their power for outcome-wide inference relative to existing methods for controlling familywise error rate. We recommend reporting our proposed metrics along with appropriate measures of effect size for all tests. We provide an R package, NRejections. Ultimately, existing procedures for multiple hypothesis testing typically penalize inference in each test, which is useful to temper interpretation of individual findings; yet on their own, these procedures do not fully characterize global evidence strength across the multiple tests. Our new metrics help remedy this limitation.
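The three metrics named in the abstract (null interval, excess hits, and the one-sided joint test) can be illustrated with a minimal Python sketch. It is not the authors' algorithm and does not use the NRejections package. It assumes a simplified setting in which each outcome is tested against a mean of zero, the global null is imposed by centering each outcome at its sample mean, and whole rows are resampled so that the correlation between outcomes is preserved. The function names `count_rejections` and `global_null_metrics`, the normal approximation to the critical value, and the simulated data are all assumptions made for illustration.

```python
import numpy as np
from statistics import NormalDist


def count_rejections(y, alpha=0.05):
    """Count outcomes (columns of y) whose two-sided test of mean zero rejects at level alpha."""
    n = y.shape[0]
    z = y.mean(axis=0) / (y.std(axis=0, ddof=1) / np.sqrt(n))
    # Assumption: normal approximation to the t critical value (reasonable for large n).
    crit = NormalDist().inv_cdf(1 - alpha / 2)
    return int(np.sum(np.abs(z) > crit))


def global_null_metrics(y, alpha=0.05, n_boot=2000, seed=0):
    """Bootstrap the number of rejections under an imposed global null."""
    rng = np.random.default_rng(seed)
    n = y.shape[0]
    theta_hat = count_rejections(y, alpha)  # observed number of rejections

    # Impose the global null: every outcome now has sample mean exactly zero.
    y_null = y - y.mean(axis=0)

    counts = np.empty(n_boot, dtype=int)
    for b in range(n_boot):
        # Resample whole rows so correlations between outcomes are kept intact.
        rows = rng.integers(0, n, size=n)
        counts[b] = count_rejections(y_null[rows], alpha)

    lo, hi = np.percentile(counts, [2.5, 97.5])
    return {
        "theta_hat": theta_hat,
        "null_interval": (float(lo), float(hi)),
        # Difference between observed rejections and the interval's upper limit.
        "excess_hits": theta_hat - float(hi),
        # One-sided joint test: share of null resamples with at least theta_hat rejections.
        "p_joint": float(np.mean(counts >= theta_hat)),
    }


# Hypothetical data: 10 equicorrelated outcomes, each with a true mean of 0.3.
rng = np.random.default_rng(1)
n, w, rho = 200, 10, 0.5
cov = np.full((w, w), rho) + (1 - rho) * np.eye(w)
y = rng.multivariate_normal(np.full(w, 0.3), cov, size=n)

result = global_null_metrics(y)
print(result)
```

Because rows are resampled jointly, the spread of the null counts reflects the correlation among outcomes: stronger correlation tends to widen the null interval, since rejections then tend to occur together.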
Funder
National Institutes of Health
John D. and Catherine T. MacArthur Foundation
National Institute on Aging
Subject
Applied Mathematics, Statistics and Probability