Prediction Regions for Poisson and Over-Dispersed Poisson Regression Models with Applications in Forecasting the Number of Deaths during the COVID-19 Pandemic
Author:
Kim Taeho (1), Lieberman Benjamin (2), Luta George (3), Peña Edsel A. (2)
Affiliation:
1. Department of Statistics, University of Haifa, Haifa, 31905, Israel
2. Department of Statistics, University of South Carolina, Columbia, SC, 29208, USA
3. Department of Biostatistics, Bioinformatics & Biomathematics, Georgetown University, Washington, District of Columbia, 20057, USA; Department of Clinical Epidemiology, Aarhus University, Aarhus, DK-8200, Denmark; The Parker Institute, Copenhagen University Hospital, Frederiksberg, DK-2000, Denmark
Abstract
Motivated by the Coronavirus Disease (COVID-19) pandemic, caused by the SARS-CoV-2 virus, and by the important problem of forecasting the number of daily deaths and the number of cumulative deaths, this paper examines the construction of prediction regions or intervals under the no-covariate (intercept-only) Poisson model, the Poisson regression model, and a new over-dispersed Poisson regression model. These models are useful in settings where the events of interest are rare. For the no-covariate Poisson model and the Poisson regression model, several prediction regions are developed and their performances are compared through simulation studies. The methods are applied to the problem of forecasting the number of daily deaths and the number of cumulative deaths in the United States (US) due to COVID-19. To examine their predictive accuracy in light of what actually happened, daily death data up to May 15, 2020 were used to forecast cumulative deaths by June 1, 2020. The observed data turned out to be over-dispersed relative to the Poisson regression model. A novel over-dispersed Poisson regression model is therefore proposed. This new model, which is distinct from the negative binomial regression (NBR) model, builds on frailty ideas from survival analysis and quantifies over-dispersion through an additional parameter. It has the flavor of a discrete measurement error model and, in contrast to the NBR model, admits a viable physical interpretation. The Poisson regression model is contained in this over-dispersed Poisson regression model as a limiting case, obtained when the over-dispersion parameter increases to infinity. A prediction region for the cumulative number of US deaths due to COVID-19 by October 1, 2020, given the data up to September 1, 2020, is presented. Realized daily and cumulative death counts from September 1 to September 25 are compared to the prediction region limits.
Finally, the paper discusses limitations of the proposed procedures and mentions open research problems. It also pinpoints dangers and pitfalls when forecasting on a long horizon, especially during a pandemic where events, both foreseen and unforeseen, could impact point predictions and prediction regions.
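As a rough illustration of the simplest setting mentioned in the abstract, the no-covariate Poisson model, the sketch below computes a naive plug-in prediction interval and a crude over-dispersion check. It is not one of the prediction regions developed in the paper: the plug-in approach ignores the uncertainty in the estimated rate, and the function names (`plug_in_prediction_interval`, `dispersion_ratio`) and the sample counts are hypothetical, chosen only for this example.

```python
import math


def poisson_pmf(k, lam):
    """P(Y = k) for Y ~ Poisson(lam), computed on the log scale for stability."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def poisson_quantile(p, lam):
    """Smallest integer k such that P(Y <= k) >= p for Y ~ Poisson(lam)."""
    k = 0
    cdf = poisson_pmf(0, lam)
    while cdf < p:
        k += 1
        cdf += poisson_pmf(k, lam)
    return k


def plug_in_prediction_interval(counts, alpha=0.05):
    """Naive plug-in interval for a new count under an intercept-only Poisson model.

    Assumption for this sketch: the estimated rate is treated as if it were
    the true rate, so estimation uncertainty is ignored.
    """
    lam_hat = sum(counts) / len(counts)  # maximum likelihood estimate of the rate
    if lam_hat <= 0:
        raise ValueError("need at least one positive count")
    lower = poisson_quantile(alpha / 2, lam_hat)
    upper = poisson_quantile(1 - alpha / 2, lam_hat)
    return lam_hat, lower, upper


def dispersion_ratio(counts):
    """Sample variance divided by sample mean; values well above 1 suggest over-dispersion."""
    n = len(counts)
    mean = sum(counts) / n
    var = sum((y - mean) ** 2 for y in counts) / (n - 1)
    return var / mean


if __name__ == "__main__":
    observed = [3, 5, 4, 6, 2]  # hypothetical counts, not data from the paper
    print(plug_in_prediction_interval(observed))
    print(dispersion_ratio(observed))
```

Under a correctly specified Poisson model the dispersion ratio should be close to 1; a ratio far above 1 is the kind of signal that motivates an over-dispersed model, although a formal assessment would need more than this single summary.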
Publisher
Walter de Gruyter GmbH