Affiliation:
1. School of Mathematics and Statistics and UCD Earth Institute, University College Dublin, Dublin, Ireland
Abstract
Accurate precipitation records are essential for monitoring the climate and studying its changes. However, analysis is typically limited by the large number of missing values. This article proposes two new imputation techniques for incomplete monthly data collected from a rainfall monitoring network in the Republic of Ireland from 1981 to 2010. The data are high-dimensional, with more than 1100 rain gauge stations, and the methods presented are designed to handle such cases. These are Elastic-Net Chained Equations (ENCE) and Multiple Imputation by Chained Equations with Direct use of Regularized Regression by elastic-net (MICE DURR). Both methods predict missing data by a series of regularized regression models; MICE DURR differs from ENCE by also using multiple imputation. Through various evaluations across different levels of missingness, ENCE and MICE DURR consistently outperformed existing imputation methods in terms of RMSE and . Moreover, they provided the best results both seasonally and for accurately predicting extreme values. RMSEs of 14.16 and 14.17 mm per month were reported for ENCE and MICE DURR, respectively, when stations that were at least 50% complete during the study period were included. For increasingly sparse data, the imputation accuracy of MICE DURR surpasses that of ENCE, demonstrating the efficacy of multiple imputation when handling a substantial amount of missing data. Validation metrics indicate that these methods compare very favourably to existing methods in the literature, such as those that use random forests or multiple linear regression.
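For readers unfamiliar with the terms, the following is a minimal sketch of the standard elastic-net objective and of RMSE, not the article's own formulation. The notation is assumed for illustration: $y_{ij}$ is the monthly rainfall at station $j$ in month $i$, $O_j$ is the set of months observed at station $j$, $\mathbf{x}_{i,-j}$ holds the values at the other stations, and $\lambda$ and $\alpha$ are the usual penalty strength and mixing parameters. In a chained-equations scheme, a regression of this kind would be fitted for each station in turn and used to predict that station's missing months.

```latex
% Elastic-net regression for station j (standard form; notation assumed, not from the article)
\hat{\beta}_{0j}, \hat{\boldsymbol{\beta}}_j
  = \arg\min_{\beta_0,\,\boldsymbol{\beta}}
    \frac{1}{2\,|O_j|} \sum_{i \in O_j}
      \left( y_{ij} - \beta_0 - \mathbf{x}_{i,-j}^{\top} \boldsymbol{\beta} \right)^2
    + \lambda \left[ \alpha \,\lVert \boldsymbol{\beta} \rVert_1
      + \frac{1-\alpha}{2} \,\lVert \boldsymbol{\beta} \rVert_2^2 \right],
  \qquad \lambda \ge 0,\; 0 \le \alpha \le 1 .

% Root mean squared error over a set V of held-out (month, station) pairs
\mathrm{RMSE}
  = \sqrt{ \frac{1}{|V|} \sum_{(i,j) \in V} \left( y_{ij} - \hat{y}_{ij} \right)^2 } .
```

Setting $\alpha = 1$ gives the lasso and $\alpha = 0$ gives ridge regression. With $y_{ij}$ in mm per month, RMSE is in the same units, matching the figures quoted above.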
Funder
Science Foundation Ireland
Cited by
1 article.