Multiple Imputation Ensembles for Time Series (MIE-TS)

Published:2023-02-22 Issue:3 Volume:17 Page:1-28
ISSN:1556-4681
Container-title:ACM Transactions on Knowledge Discovery from Data
language:en
Short-container-title:ACM Trans. Knowl. Discov. Data

Author:

Aleryani Aliya¹, Bostrom Aaron¹, Wang Wenjia¹, Iglesia Beatriz¹

Affiliation:

1. University of East Anglia, Norwich, Norfolk, UK

Abstract

Time series classification has become an interesting field of research, thanks to the extensive studies conducted in the past two decades. Time series may have missing data, which can affect both the representation and the modeling of time series. Thus, recovering missing data using appropriate time series-based imputation methods is an essential step. Multiple imputation is a data recovery method that produces multiple imputed datasets. The method has proved useful for reflecting the uncertainty inherent in missing data; however, it is under-researched in time series problems. In this article, we propose two multiple imputation approaches for time series. The first is a multiple imputation method based on interpolation. The second is a multiple imputation and ensemble method. First, we simulate missing consecutive sub-sequences under a Missing Completely at Random mechanism; then, we apply single and multiple imputation methods. The imputed data are used to build bagging and stacking ensembles. We build ensembles using standard classification algorithms as well as time series classifiers. The standard classifiers include Random Forest, Support Vector Machines, K-Nearest Neighbour, C4.5, and PART, while TSCHIEF, Proximity Forest, Time Series Forest, RISE, and BOSS are chosen as time series classifiers. Our findings show that the combination of multiple imputation and ensembles improves the performance of the majority of classifiers tested in this study, often above the performance obtained from the complete data, even as the proportion of missing data increases. This may be because the diversity injected by multiple imputation has a favourable and stabilising effect on classifier performance, which is an important finding.
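The pipeline outlined in the abstract (mask a consecutive sub-sequence at random, impute it several times, train one classifier per imputed dataset, combine the predictions) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how the interpolation-based multiple imputation works, so the version below assumes linear interpolation perturbed by Gaussian noise, uses a 1-nearest-neighbour classifier as a stand-in for the classifiers listed, and combines predictions by simple majority vote rather than by the bagging or stacking schemes the paper evaluates. All function names and parameters (`mask_subsequence`, `impute_once`, `noise_sd`, and so on) are hypothetical.

```python
import random
from collections import Counter

def mask_subsequence(series, gap_len, rng):
    """Blank out one consecutive run of gap_len values.
    The start is drawn independently of the values (MCAR);
    both endpoints are kept observed so interpolation is defined."""
    start = rng.randrange(1, len(series) - gap_len)
    out = list(series)
    for i in range(start, start + gap_len):
        out[i] = None
    return out

def impute_once(series, noise_sd, rng):
    """One stochastic imputation (assumed scheme): linear interpolation
    across each gap plus Gaussian noise. Assumes gaps do not touch the ends."""
    out = list(series)
    i, n = 0, len(out)
    while i < n:
        if out[i] is None:
            j = i
            while out[j] is None:  # find the first observed value after the gap
                j += 1
            left, right = out[i - 1], out[j]
            span = j - i + 1
            for k in range(i, j):
                frac = (k - i + 1) / span
                out[k] = left + frac * (right - left) + rng.gauss(0.0, noise_sd)
            i = j
        else:
            i += 1
    return out

def multiple_impute(series, m, noise_sd, rng):
    """Return m differently imputed copies of one series."""
    return [impute_once(series, noise_sd, rng) for _ in range(m)]

def nearest_neighbour_label(train_X, train_y, query):
    """1-NN with squared Euclidean distance (stand-in classifier)."""
    dists = [sum((a - b) ** 2 for a, b in zip(x, query)) for x in train_X]
    return train_y[dists.index(min(dists))]

def ensemble_predict(imputed_train_sets, train_y, query):
    """One classifier per imputed training set, combined by majority vote."""
    votes = [nearest_neighbour_label(X, train_y, query) for X in imputed_train_sets]
    return Counter(votes).most_common(1)[0][0]

# Toy data: class 0 is a rising ramp, class 1 a falling ramp.
rng = random.Random(0)
length = 20
train = [[float(t) for t in range(length)],
         [float(length - t) for t in range(length)]]
train_y = [0, 1]

masked = [mask_subsequence(s, 5, rng) for s in train]
m = 5
per_series = [multiple_impute(s, m, 0.1, rng) for s in masked]
# Regroup so that set k holds the k-th imputation of every training series.
imputed_train_sets = [[per_series[i][k] for i in range(len(masked))]
                      for k in range(m)]

query = [float(t) for t in range(length)]
pred = ensemble_predict(imputed_train_sets, train_y, query)
```

Each of the `m` imputed training sets differs only inside the masked gap, which is the source of the ensemble diversity the abstract refers to. Replacing the majority vote with a meta-learner trained on the base predictions would move the sketch towards a stacking ensemble.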

Funder

The Business and Local Government Data Research Centre

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science

Link

https://dl.acm.org/doi/pdf/10.1145/3551643

Reference: 57 articles.

1. W. Vickers, A. Bagnall, J. Lines, and E. Keogh. 2021. The UEA & UCR Time Series Classification Repository. Retrieved from http://www.timeseriesclassification.com.

2. Aliya Aleryani. 2021. Simulation of Missing Data. Retrieved from https://github.com/AliyaAleryani/Simulation-of-Missing-Data.git.

3. Aliya Aleryani, Wenjia Wang, and Beatriz De La Iglesia. 2018. Dealing with missing data and uncertainty in the context of data mining. In Proceedings of the International Conference on Hybrid Artificial Intelligence Systems. Springer, 289–301.

4. Multiple Imputation Ensembles (MIE) for Dealing with Missing Data

Cited by 3 articles.

1. Weighted Average Ensemble-Based PV Forecasting in a Limited Environment with Missing Data of PV Power;Sustainability;2024-05-13

2. The Impact of Imputation Methods on the Classification of Household Devices from Electricity Usage Time Series;2023 Tenth International Conference on Social Networks Analysis, Management and Security (SNAMS);2023-11-21

3. Multivariate Imputation by N Neighbour Mean and Chained Equation for Time Series Missing Data;2023 IEEE 2nd International Conference on Industrial Electronics: Developments & Applications (ICIDeA);2023-09-29