Abstract
Traditionally, common testing problems are formalized in terms of a precise null hypothesis representing an idealized situation, such as the absence of a certain "treatment effect". In most applications, however, the real purpose of the analysis is to assess evidence in favor of a practically relevant effect, rather than simply to determine its presence or absence. This discrepancy leads to erroneous inferential conclusions, especially with moderate or large sample sizes. In particular, statistical significance, as commonly evaluated on the basis of a low p value for a precise hypothesis, bears little or no information on practical significance. This paper presents an innovative approach to the problem of testing the practical relevance of effects. It relies on a general method for modifying standard tests so that they can handle appropriate interval null hypotheses containing all practically irrelevant effect sizes. In addition, when it is difficult to specify exactly which effect sizes are irrelevant, we provide the researcher with a benchmark value, so that acceptance or rejection can be established purely by deciding on the (ir)relevance of this value. We illustrate our proposal in the context of many important testing setups, and we apply the proposed methods to two case studies in clinical medicine. First, we consider data on the evaluation of systolic blood pressure in a sample of adult participants at risk for nutritional deficit. Second, we focus on a study of the effects of remdesivir on patients hospitalized with COVID-19.
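The contrast between a precise null and an interval null can be illustrated with a minimal sketch. It is a standard textbook construction for a normal mean with known standard deviation, not necessarily the method developed in the paper. The function names, the irrelevance threshold `delta`, and the numbers in the usage example are all hypothetical and chosen only for illustration.

```python
import math


def normal_sf(x: float) -> float:
    """Upper-tail probability P(Z > x) for a standard normal Z."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def interval_null_p_value(xbar: float, sigma: float, n: int, delta: float) -> float:
    """p value for H0: |mu| <= delta against H1: |mu| > delta.

    Assumes (for illustration only) i.i.d. normal data with known sigma.
    The rejection probability of the test based on |xbar| is largest at
    the boundary mu = +/- delta, so the p value is evaluated there.
    Setting delta = 0 recovers the usual two-sided z-test p value.
    """
    if sigma <= 0 or n <= 0 or delta < 0:
        raise ValueError("sigma and n must be positive, delta non-negative")
    z = abs(xbar) * math.sqrt(n) / sigma   # observed standardized mean
    c = delta * math.sqrt(n) / sigma       # standardized boundary of H0
    return normal_sf(z - c) + normal_sf(z + c)


# Hypothetical data: a tiny observed effect in a very large sample.
p_point = interval_null_p_value(xbar=0.02, sigma=1.0, n=100_000, delta=0.0)
p_interval = interval_null_p_value(xbar=0.02, sigma=1.0, n=100_000, delta=0.1)
print(f"precise null:  p = {p_point:.3g}")
print(f"interval null: p = {p_interval:.3g}")
```

With these made-up numbers the precise null is rejected at any conventional level, while the interval null, which treats effects up to 0.1 in absolute value as irrelevant, is not rejected. This is the gap between statistical and practical significance that the abstract describes.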
Funder
Università degli Studi di Brescia
Publisher
Springer Science and Business Media LLC