ForestTI: A Scalable Inverted-Index-Oriented Timeseries Management System with Flexible Memory Efficiency

Published: 2023-06-13 Issue: 2 Volume: 1 Page: 1-25
ISSN: 2836-6573
Container-title: Proceedings of the ACM on Management of Data
Language: en
Short-container-title: Proc. ACM Manag. Data

Author:

Wang Zhiqi¹, Shao Zili¹

Affiliation:

1. The Chinese University of Hong Kong, Hong Kong, China

Abstract

Timeseries management systems play an important role in IoT and performance monitoring. As data volumes grow, such systems increasingly need to absorb data memory-efficiently and with high throughput. However, the designs of existing systems, especially their in-memory data structures, suffer from two issues. First, they face a trade-off between memory efficiency and performance. Second, they do not scale because lock contention prevents them from benefiting from parallel insertion and querying. In this paper, we propose ForestTI, a scalable inverted-index-oriented timeseries management system in which the balance point between memory efficiency and performance can be flexibly adjusted under increasing memory pressure. First, we present a two-level inverted index that scales with optimistic lock coupling and whose internal structure can be gradually converted to more memory-efficient representations. Second, we propose a two-level pointer swizzling mechanism that actively swaps out cold posting lists and in-memory timeseries objects as the number of timeseries increases. Finally, we further optimize the on-disk data structures (i.e., write-ahead logs and the LSM-tree) to keep up with the high insertion throughput of the in-memory components. We prototype ForestTI in C++ from scratch. Compared to the storage engine of Prometheus, ForestTI achieves 1.79x higher insertion throughput, 52.1% lower query latency, and 56.9% lower memory occupation. We have released the open-source code of ForestTI for public access.
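
For readers unfamiliar with the term, the sketch below illustrates the general idea of an inverted index over timeseries labels: each label pair maps to a sorted posting list of series IDs, and a multi-label query intersects those lists. This is only a minimal single-threaded illustration. It is not ForestTI's two-level index, and it omits optimistic lock coupling, pointer swizzling, and the compact representations named in the abstract. All names (`InvertedIndex`, `SeriesId`, `Label`, `add`, `query`) are hypothetical and do not come from the paper or its code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical names throughout; this is not ForestTI's actual API.
using SeriesId = std::uint64_t;
using Label = std::pair<std::string, std::string>;  // e.g. {"job", "api"}

class InvertedIndex {
 public:
  // Register a series under each of its labels. Posting lists are kept
  // sorted and duplicate-free so they can be intersected by a linear merge.
  void add(SeriesId id, const std::vector<Label>& labels) {
    for (const auto& l : labels) {
      auto& list = postings_[key(l)];
      auto pos = std::lower_bound(list.begin(), list.end(), id);
      if (pos == list.end() || *pos != id) list.insert(pos, id);
    }
  }

  // Return the IDs of all series that carry every label in `matchers`.
  // An empty matcher list yields an empty result.
  std::vector<SeriesId> query(const std::vector<Label>& matchers) const {
    std::vector<SeriesId> result;
    bool first = true;
    for (const auto& m : matchers) {
      auto it = postings_.find(key(m));
      if (it == postings_.end()) return {};  // unknown label: no match
      if (first) {
        result = it->second;
        first = false;
        continue;
      }
      std::vector<SeriesId> tmp;
      std::set_intersection(result.begin(), result.end(),
                            it->second.begin(), it->second.end(),
                            std::back_inserter(tmp));
      result = std::move(tmp);
    }
    return result;
  }

 private:
  // Join name and value with a NUL byte, assuming neither contains one.
  static std::string key(const Label& l) {
    return l.first + '\0' + l.second;
  }

  std::unordered_map<std::string, std::vector<SeriesId>> postings_;
};
```

With series 1 and 2 labeled `job=api` and series 1 and 3 labeled `host=a`, querying for both labels returns only series 1. A production system would additionally need concurrency control and a strategy for evicting cold posting lists, which is the part of the design space the abstract addresses.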

Funder

Direct Grant for Research, The Chinese University of Hong Kong

Research Grants Council of the Hong Kong Special Administrative Region, China

Publisher

Association for Computing Machinery (ACM)

Link

https://dl.acm.org/doi/pdf/10.1145/3589260
