Abstract
Processing and analyzing time series datasets has become a central issue in many domains, requiring data management systems to support time series as a native data type. A core access primitive for time series is matching, which requires efficient algorithms on top of appropriate representations such as the symbolic aggregate approximation (SAX), the current state of the art. This technique reduces a time series to a low-dimensional space by segmenting it and discretizing each segment into a symbol from a small alphabet. Unfortunately, SAX ignores the deterministic behavior of time series, such as cyclically repeating patterns or a trend component affecting all segments, which may lead to sub-optimal representation accuracy. We therefore introduce a novel season-aware and a novel trend-aware symbolic approximation and demonstrate improved representation accuracy without increasing the memory footprint. Most importantly, our techniques also enable more efficient time series matching, providing a match up to three orders of magnitude faster than SAX.
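The segment-and-discretize step that the abstract attributes to SAX can be illustrated with a minimal sketch. It covers plain SAX only, not the season-aware or trend-aware variants introduced in the paper, and it rests on assumptions not stated in the abstract: the function name `sax` and its parameters are hypothetical, the series length is assumed to be a multiple of the segment count, and breakpoints are taken as equiprobable quantiles of the standard normal distribution, as in the commonly described SAX formulation.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def sax(series, n_segments, alphabet_size):
    """Sketch of plain SAX: z-normalize, average each segment, map to symbols.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    if len(series) % n_segments != 0:
        raise ValueError("series length must be a multiple of n_segments")

    # Step 1: z-normalize so the values are comparable to a standard normal.
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        normalized = [0.0] * len(series)  # constant series: no variation
    else:
        normalized = [(x - mu) / sigma for x in series]

    # Step 2: segment the series and keep one mean value per segment.
    seg_len = len(series) // n_segments
    segment_means = [
        mean(normalized[i:i + seg_len])
        for i in range(0, len(series), seg_len)
    ]

    # Step 3: breakpoints split the standard normal into equiprobable regions
    # (an assumption of this sketch); each region corresponds to one symbol.
    std_normal = NormalDist()
    breakpoints = [
        std_normal.inv_cdf(k / alphabet_size) for k in range(1, alphabet_size)
    ]
    return "".join(
        chr(ord("a") + bisect_right(breakpoints, value))
        for value in segment_means
    )


print(sax([1, 1, 2, 2, 3, 3, 4, 4], n_segments=4, alphabet_size=4))  # abcd
```

In this toy input the steady upward trend alone determines every symbol, which is the kind of deterministic component the abstract says plain SAX does not account for separately.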
Funder
Technische Universität Dresden
Publisher
Springer Science and Business Media LLC
Subject
General Earth and Planetary Sciences, General Environmental Science
Cited by
4 articles.