Handling missing data methods for estimating the incidence rate of COVID-19 pandemic data: A case study in Vietnam (Preprint)

Author:

Pham Hai-Thanh, Do-Thi Thanh-Toan, Baek Jonggyu, Nguyen Cong-Khanh, Pham Quang-Thai, Nguyen Hoa L, Goldberg Robert J, Pham Loc Quang, Le Giang Minh

Abstract

BACKGROUND

The COVID-19 pandemic, marked by lockdowns of varying duration across countries and by overcrowded healthcare facilities, has introduced new challenges for disease forecasting. One pressing issue is the handling of missing data arising from diverse sources.

OBJECTIVE

To show how the method used to handle missing data can affect estimates of the COVID-19 incidence rate (CIR).

METHODS

The current study used data from the surveillance system of COVID-19/SARS-CoV-2 patients treated at the National Institute of Hygiene and Epidemiology, Hanoi, Vietnam. To simulate data missing completely at random (MCAR), we randomly removed between 5% and 30% of the values, in increments of 5%, from the daily COVID-19 case load variable. We then assessed six methods for handling the missing data: backfill imputation, moving average, median imputation, maximum likelihood, linear interpolation, and the autoregressive integrated moving average (ARIMA) model.
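The masking-and-imputation design described above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the fixed seed, and the toy `daily_cases` series are all hypothetical, and only three of the six methods (median, backfill, linear interpolation) are shown, using the Python standard library.

```python
import random
import statistics

def remove_mcar(series, fraction, seed=0):
    """Copy `series`, setting `fraction` of entries to None uniformly at random (MCAR)."""
    rng = random.Random(seed)
    n_missing = round(len(series) * fraction)
    missing_idx = set(rng.sample(range(len(series)), n_missing))
    return [None if i in missing_idx else v for i, v in enumerate(series)]

def impute_median(series):
    """Replace every gap with the median of the observed values."""
    med = statistics.median(v for v in series if v is not None)
    return [med if v is None else v for v in series]

def impute_backfill(series):
    """Fill each gap with the next observed value.

    Trailing gaps have no later observation, so they fall back to the
    last observed value (an assumption of this sketch).
    """
    out = list(series)
    next_val = None
    for i in range(len(out) - 1, -1, -1):   # backward pass: carry next value back
        if out[i] is None:
            out[i] = next_val
        else:
            next_val = out[i]
    prev = None
    for i in range(len(out)):               # forward pass: only trailing gaps remain
        if out[i] is None:
            out[i] = prev
        else:
            prev = out[i]
    return out

def impute_linear(series):
    """Linearly interpolate gaps; edge gaps copy the nearest observed value."""
    known = [i for i, v in enumerate(series) if v is not None]
    out = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            out[i] = series[right]
        elif right is None:
            out[i] = series[left]
        else:
            w = (i - left) / (right - left)
            out[i] = series[left] + w * (series[right] - series[left])
    return out

# Toy daily case counts (invented for illustration, not study data).
daily_cases = [12, 15, 11, 18, 20, 17, 25, 30, 28, 26,
               31, 35, 33, 29, 27, 24, 22, 21, 19, 16]

# Mask 5% to 30% of the series in 5% steps and impute each masked copy.
imputed = {}
for pct in range(5, 35, 5):
    masked = remove_mcar(daily_cases, pct / 100)
    imputed[pct] = {
        "median": impute_median(masked),
        "backfill": impute_backfill(masked),
        "linear": impute_linear(masked),
    }
```

In a real analysis the masking would typically be repeated over many random seeds, so that the comparison between methods does not depend on one particular pattern of gaps.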

RESULTS

During the Zero-COVID period, the median imputation method yielded lower mean absolute crude bias (ACB) and mean crude root mean square error (RMSE) values compared to the other methods, irrespective of the extent of missing data; the median imputation method exhibited the lowest mean absolute percentage change (APC) in the CIR. During the Transition period, the ARIMA model of imputation demonstrated the lowest mean ACB across all levels of missing data and the lowest mean APC values. During the New-normal period, the backfill and linear interpolation methods demonstrated the lowest mean ACB across all levels of missing data and relatively lower mean APC values compared with the other imputation methods.
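The abstract does not define ACB, RMSE, or APC formally, so the following is only a sketch under assumed definitions: bias as the absolute difference between the imputed and true series means, RMSE in its standard form, and APC as the absolute percentage difference between the CIR computed from the imputed series and from the complete series. The `population` argument and the per-100,000 scaling are likewise assumptions made for illustration.

```python
import math

def rmse(true, imputed):
    """Root mean square error between the complete and the imputed series."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(true, imputed)) / len(true))

def absolute_bias(true, imputed):
    """Assumed definition: |mean(imputed) - mean(true)|."""
    return abs(sum(imputed) / len(imputed) - sum(true) / len(true))

def incidence_rate(daily_cases, population, per=100_000):
    """Cumulative cases per `per` persons (assumed CIR definition)."""
    return sum(daily_cases) / population * per

def absolute_percentage_change(true, imputed, population):
    """Assumed definition: |CIR_imputed - CIR_true| / CIR_true * 100."""
    cir_true = incidence_rate(true, population)
    cir_imputed = incidence_rate(imputed, population)
    return abs(cir_imputed - cir_true) / cir_true * 100

# Toy example (invented numbers): one imputed value overshoots by 6 cases.
true_series = [10, 20, 30]
imputed_series = [10, 26, 30]
example = {
    "rmse": rmse(true_series, imputed_series),
    "bias": absolute_bias(true_series, imputed_series),
    "apc": absolute_percentage_change(true_series, imputed_series, population=1_000_000),
}
```

Averaging these quantities over repeated random maskings at each missingness level would give per-method means comparable in spirit to those reported above.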

CONCLUSIONS

Our study emphasizes the importance of choosing the most appropriate missing data handling method, in the context of a specific disease situation, to ensure reliable estimates of the CIR.

Publisher

JMIR Publications Inc.
