Investigation of using missing data imputation methodologies effect on the SARIMA model performance: application to average monthly flows

Author:

Bleidorn Michel Trarbach (1), Schmidt Isamara Maria (1), Reis José Antonio Tosta dos (1), Pani Deysilara Figueira (1), Pinto Wanderson de Paula (2), Solci Carlo Corrêa (1), Mendonça Antonio Sergio Ferreira (1), Brasil Gutemberg Hespanha (1)

Affiliation:

1. Universidade Federal do Espírito Santo, Brasil

2. Centro Universitário FAVENI, Brasil

Abstract

Accurate river flow forecasts are crucial in hydrology, but they are challenged by the quality of fluviometric data. This study investigates the impact of different missing data imputation methods on the performance of the Seasonal Autoregressive Integrated Moving Average (SARIMA) model. A SARIMA(1,1,1)(0,1,1)12 model was selected using semi-automated criteria: lowest AIC, significant parameters (p-value < 0.05) and adequate residuals. Its performance was then compared against series reconstructed with different imputation methods: Mean (AM), Median (M), Spline and Stineman interpolation, Regional Weighting (RW), Multiple Linear Regression (MLR), Multiple Imputation (MI) and Maximum Likelihood (ML). The analysis considered scenarios of 5, 20 and 40% missing data, following random and block patterns, using data from the Doce River in Southeast Brazil. The performance indicators and their relative differences indicated that the univariate AM and M methods and the multivariate RW and MLR methods limited the model's performance, while the univariate Spline and Stineman methods and the multivariate MI and ML methods did not present significant limitations, except Spline under the block pattern. It is concluded that the accuracy of future predictions depends not only on a well-trained and validated model, but also on the appropriate use of missing data imputation methods.
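The experimental design described in the abstract (missing fractions of 5, 20 and 40%, random versus block gaps, then imputation) can be illustrated with a minimal sketch. It is not the authors' code: the flow series is synthetic, the helper names `make_mask`, `impute` and `rmse` are hypothetical, only the two simplest methods (mean and median) are shown, and the score is the reconstruction error against the complete series rather than the SARIMA forecast performance evaluated in the study.

```python
import math
import random
import statistics


def make_mask(n, fraction, pattern, rng):
    """Return the indices to blank out.

    'random' scatters the gaps over the series; 'block' removes one
    contiguous run of the same total length.
    """
    k = round(n * fraction)
    if pattern == "random":
        return sorted(rng.sample(range(n), k))
    if pattern == "block":
        start = rng.randrange(0, n - k + 1)
        return list(range(start, start + k))
    raise ValueError(f"unknown pattern: {pattern}")


def impute(series, missing, method):
    """Fill the missing positions with the mean or median of the observed values."""
    gaps = set(missing)
    observed = [v for i, v in enumerate(series) if i not in gaps]
    if method == "mean":
        fill = statistics.mean(observed)
    elif method == "median":
        fill = statistics.median(observed)
    else:
        raise ValueError(f"unknown method: {method}")
    return [fill if i in gaps else v for i, v in enumerate(series)]


def rmse(a, b):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# Synthetic monthly "flows" (assumption): 20 years with a 12-month cycle plus noise.
rng = random.Random(0)
flows = [
    500 + 300 * math.sin(2 * math.pi * (t % 12) / 12) + rng.gauss(0, 30)
    for t in range(240)
]

results = {}
for fraction in (0.05, 0.20, 0.40):
    for pattern in ("random", "block"):
        missing = make_mask(len(flows), fraction, pattern, rng)
        for method in ("mean", "median"):
            filled = impute(flows, missing, method)
            results[(fraction, pattern, method)] = rmse(flows, filled)

for key, value in results.items():
    print(key, round(value, 1))
```

In a fuller replication, each reconstructed series would be passed to a SARIMA(1,1,1)(0,1,1)12 fit and the forecast indicators compared with those obtained from the complete series; that step is omitted here to keep the sketch dependency-free.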

Publisher

FapUNIFESP (SciELO)

