CIVET: Exploring Compact Index for Variable-Length Subsequence Matching on Time Series

Published:2024-05 Issue:9 Volume:17 Page:2123-2135
ISSN:2150-8097
Container-title:Proceedings of the VLDB Endowment
language:en
Short-container-title:Proc. VLDB Endow.

Author:

Xiong Haoran¹, Zhang Hang¹, Wang Zeyu¹, He Zhenying¹, Wang Peng¹, Wang X. Sean¹

Affiliation:

1. Fudan University

Abstract

Managing and analyzing rapidly growing collections of time series is becoming increasingly challenging. Subsequence matching, a core subroutine in time series analysis, has drawn significant research attention. Most previous work focuses only on matching subsequences of the same length as the query. However, many scenarios require efficient variable-length subsequence matching. In this paper, we propose a new representation, Uniform Piecewise Aggregate Approximation (UPAA), which can align features across variable-length time series while retaining the lower-bounding property. Based on UPAA, we present a compact index structure built by grouping adjacent subsequences and similar subsequences, respectively. Moreover, we propose an index pruning algorithm and a data filtering strategy to efficiently support variable-length subsequence matching without false dismissals. Experiments on both real and synthetic datasets demonstrate that our approach achieves considerably better efficiency, scalability, and effectiveness than existing approaches.

Publisher

Association for Computing Machinery (ACM)

Link

https://dl.acm.org/doi/pdf/10.14778/3665844.3665845
