Abstract
Missing data reconstruction is a critical step in the analysis and mining of spatio-temporal data. However, few studies comprehensively consider missing data patterns, sample selection and spatio-temporal relationships. To account for the uncertainty in the point forecast, prediction intervals may be of interest. In particular, for (possibly long) missing sequences of consecutive time points, joint prediction regions are desirable. In this paper we propose a bootstrap resampling scheme to construct joint prediction regions that approximately contain the missing paths of a time component in a spatio-temporal framework, with global probability $$1-\alpha$$. In many applications, requiring coverage of the whole missing sample path might appear too restrictive. To obtain more informative inference, we also derive smaller joint prediction regions that contain all elements of the missing paths, except for at most a small number k of them, with probability $$1-\alpha$$. A simulation experiment is performed to validate the empirical performance of the proposed joint bootstrap prediction and to compare it with some alternative procedures based on a simple nominal coverage correction, loosely inspired by the Bonferroni approach, which are expected to work well in standard scenarios.
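To make the idea of a joint prediction region concrete, the following is a minimal sketch and not the resampling scheme of the paper. It assumes that bootstrap replicates of the missing path are already available (here they are simply simulated as random-walk noise around a made-up point prediction) and builds the region from the quantile of the k-th largest standardised bootstrap error, which is one common construction. The function names `joint_prediction_region`, `bonferroni_region` and `path_coverage` are hypothetical, as are all numerical settings.

```python
import numpy as np


def joint_prediction_region(paths, point, alpha=0.1, k=1):
    """Region intended to contain all but fewer than k points of a path.

    paths : (B, H) array of bootstrap replicates of the missing path (assumed given)
    point : (H,) array with the point prediction
    k = 1 asks for coverage of the whole path.
    """
    paths = np.asarray(paths, dtype=float)
    point = np.asarray(point, dtype=float)
    # Bootstrap standard deviation at each missing time point
    scale = paths.std(axis=0, ddof=1)
    # Absolute standardised bootstrap errors, shape (B, H)
    z = np.abs(paths - point) / scale
    # k-th largest standardised error within each bootstrap path
    kth_largest = np.sort(z, axis=1)[:, -k]
    # Common multiplier: (1 - alpha) quantile of that statistic
    d = np.quantile(kth_largest, 1 - alpha)
    return point - d * scale, point + d * scale


def bonferroni_region(paths, alpha=0.1):
    """Marginal percentile intervals with a Bonferroni-type level correction."""
    paths = np.asarray(paths, dtype=float)
    h = paths.shape[1]
    lower = np.quantile(paths, alpha / (2 * h), axis=0)
    upper = np.quantile(paths, 1 - alpha / (2 * h), axis=0)
    return lower, upper


def path_coverage(paths, lower, upper, k=1):
    """Share of paths with fewer than k points outside [lower, upper]."""
    outside = ((paths < lower) | (paths > upper)).sum(axis=1)
    return float(np.mean(outside < k))


# Illustrative data only: simulated replicates, not a real bootstrap.
rng = np.random.default_rng(0)
B, H = 1000, 12
point = np.linspace(10.0, 12.0, H)
paths = point + 0.3 * np.cumsum(rng.normal(size=(B, H)), axis=1)

lo1, hi1 = joint_prediction_region(paths, point, alpha=0.1, k=1)
lo2, hi2 = joint_prediction_region(paths, point, alpha=0.1, k=2)
lob, hib = bonferroni_region(paths, alpha=0.1)

print("whole-path coverage, k=1:", path_coverage(paths, lo1, hi1, k=1))
print("coverage up to one miss, k=2:", path_coverage(paths, lo2, hi2, k=2))
print("whole-path coverage, Bonferroni:", path_coverage(paths, lob, hib, k=1))
print("mean width k=1 / k=2 / Bonferroni:",
      (hi1 - lo1).mean(), (hi2 - lo2).mean(), (hib - lob).mean())
```

In this sketch the coverage is evaluated on the same replicates used to build the regions, so it only illustrates the construction. Allowing k = 2 can only shrink the multiplier relative to k = 1, which is why the relaxed region is never wider.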
Funder
Università degli Studi di Salerno
Publisher
Springer Science and Business Media LLC
Subject
Computational Mathematics; Statistics, Probability and Uncertainty; Statistics and Probability