Time series data encoding for efficient storage

Published: 2022-06 Issue: 10 Volume: 15 Page: 2148-2160
ISSN:2150-8097
Container-title:Proceedings of the VLDB Endowment
language:en
Short-container-title:Proc. VLDB Endow.

Author:

Xiao Jinzhao¹, Huang Yuxiang¹, Hu Changyu¹, Song Shaoxu¹, Huang Xiangdong¹, Wang Jianmin¹

Affiliation:

1. Tsinghua University

Abstract

Both the wide range of applications and the distinct features of time series data have driven the rapid growth of time series database management systems such as Apache IoTDB, InfluxDB, and OpenTSDB. Almost all of these systems employ columnar storage with effective encoding of time series data. Given the distinct features of various time series data, it is not surprising that different encoding strategies perform differently. In this study, we first summarize the features of time series data that may affect encoding performance, including scale, delta, repeat and increase. Then, we introduce the storage scheme of a typical time series database, Apache IoTDB, which prescribes the limits on implementing encoding algorithms in the system. A qualitative analysis of encoding effectiveness with respect to the various data features is then presented for the studied algorithms. To this end, we develop a benchmark for evaluating encoding algorithms, including a data generator based on the aforesaid data features and several real-world datasets from our industrial partners. Finally, we present an extensive experimental evaluation using the benchmark. Notably, a quantitative analysis of encoding effectiveness with respect to the various data features is conducted in Apache IoTDB.

Publisher

Association for Computing Machinery (ACM)

Subject

General Earth and Planetary Sciences; Water Science and Technology; Geography, Planning and Development

Link

https://dl.acm.org/doi/pdf/10.14778/3547305.3547319

Reference: 48 articles.

1. https://iotdb.apache.org/

2. https://www.influxdata.com/

3. http://opentsdb.net/

4. https://prometheus.io/

5. https://github.com/apache/iotdb/tree/research/encoding-exp

Cited by 15 articles.

1. On Tuning Raft for IoT Workload in Apache IoTDB;2024 IEEE 40th International Conference on Data Engineering (ICDE);2024-05-13

2. Joint Directory, File and IO Trace Feature Extraction and Feature-based Trace Regeneration for Enterprise Storage Systems;2024 IEEE 40th International Conference on Data Engineering (ICDE);2024-05-13

3. REGER: Reordering Time Series Data for Regression Encoding;2024 IEEE 40th International Conference on Data Engineering (ICDE);2024-05-13

4. Cocv: A compression algorithm for time-series data with continuous constant values in IoT-based monitoring systems;Internet of Things;2024-04

5. Time Series Representation for Visualization in Apache IoTDB;Proceedings of the ACM on Management of Data;2024-03-12