A Safe Hosmer-Lemeshow Test

Published: 2023 Page: 1-15
ISSN: 2693-7166
Container-title: The New England Journal of Statistics in Data Science
Language: en

Author:

Henzi Alexander, Puke Marius, Dimitriadis Timo, Ziegel Johanna

Abstract

This article proposes an alternative to the Hosmer-Lemeshow (HL) test for evaluating the calibration of probability forecasts for binary events. The approach is based on e-values, a new tool for hypothesis testing. An e-value is a nonnegative random variable whose expected value is less than or equal to one under the null hypothesis. Large e-values give evidence against the null hypothesis, and the multiplicative inverse of an e-value is a (conservative) p-value. Our test uses online isotonic regression to estimate the calibration curve, which serves as a ‘betting strategy’ against the null hypothesis. We show that the test has power against essentially all alternatives, which makes it theoretically superior to the HL test and at the same time resolves the well-known instability problem of the latter. A simulation study shows that a feasible version of the proposed eHL test can detect slight miscalibrations at practically relevant sample sizes, but that its universal validity and power guarantees come at the cost of reduced empirical power compared to the HL test in a classical simulation setup. We illustrate our test on recalibrated predictions for credit card defaults during the Taiwan credit card crisis, where the classical HL test delivers equivocal results.
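
To make the e-value idea concrete, here is a minimal Python sketch of a betting-style e-value for the null hypothesis that each binary outcome follows the issued forecast probability. It is not the eHL test of the article: that test derives its bets from online isotonic regression, whereas this sketch takes the alternative probabilities as a given input. The function names `e_value` and `p_value_from_e` and the example numbers are assumptions made for illustration only.

```python
import math


def e_value(forecasts, outcomes, alternatives):
    """Product of likelihood-ratio bets against H0: y_t ~ Bernoulli(p_t).

    forecasts:    forecast probabilities p_t, each strictly in (0, 1)
    outcomes:     observed binary outcomes y_t (0 or 1)
    alternatives: hypothetical alternative probabilities q_t in (0, 1);
                  each q_t must be fixed before y_t is observed
    """
    log_e = 0.0
    for p, y, q in zip(forecasts, outcomes, alternatives):
        if not (0.0 < p < 1.0 and 0.0 < q < 1.0):
            raise ValueError("probabilities must lie strictly in (0, 1)")
        null_lik = p if y == 1 else 1.0 - p
        alt_lik = q if y == 1 else 1.0 - q
        # Under H0 each factor alt_lik / null_lik has expectation
        # p * (q / p) + (1 - p) * ((1 - q) / (1 - p)) = 1.
        log_e += math.log(alt_lik) - math.log(null_lik)
    return math.exp(log_e)


def p_value_from_e(e):
    """Conservative p-value: the inverse of the e-value, capped at one."""
    return 1.0 if e <= 1.0 else 1.0 / e


# Illustrative data (assumed): forecasts of 0.5 while 8 of 10 events occur,
# with a bet that the true event probability is 0.8.
forecasts = [0.5] * 10
outcomes = [1, 1, 1, 0, 1, 1, 1, 1, 0, 1]
alternatives = [0.8] * 10

e = e_value(forecasts, outcomes, alternatives)
alpha = 0.05
print(f"e-value = {e:.3f}, p-value = {p_value_from_e(e):.3f}")
print("reject H0" if e >= 1.0 / alpha else "do not reject H0")
```

By Markov's inequality, rejecting when the e-value reaches 1/alpha keeps the type I error at or below alpha, which is why its inverse can be read as a conservative p-value.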

Publisher

New England Statistical Society

Cited by 2 articles.

1. Editorial. Game-Theoretic Statistics and Safe Anytime-Valid Inference;The New England Journal of Statistics in Data Science;2024

2. Associations between long-term exposure to air pollution, diabetes, and hypertension in metropolitan Iran: an ecologic study;International Journal of Environmental Health Research;2023-09-06