ForestTI: A Scalable Inverted-Index-Oriented Timeseries Management System with Flexible Memory Efficiency

Authors:

Wang Zhiqi¹, Shao Zili¹

Affiliation:

1. The Chinese University of Hong Kong, Hong Kong, China

Abstract

Timeseries management systems play an important role in IoT and performance monitoring. As data volumes grow, these systems increasingly need to absorb data with both high throughput and high memory efficiency. However, the designs of existing systems, especially their in-memory data structures, suffer from two issues. First, they force a trade-off between memory efficiency and performance. Second, they do not scale: lock contention prevents them from benefiting from parallel insertion and querying. In this paper, we propose ForestTI, a scalable inverted-index-oriented timeseries management system in which the balance point between memory efficiency and performance can be flexibly adjusted as memory pressure increases. First, we present a two-level inverted index that scales through optimistic lock coupling and whose internal structure can be gradually converted to more memory-efficient representations. Second, we propose a two-level pointer swizzling mechanism that actively swaps out cold posting lists and in-memory timeseries objects as the number of timeseries increases. Finally, we optimize the on-disk data structures (i.e., write-ahead logs and the LSM-tree) to keep up with the high insertion throughput of the in-memory components. We prototype ForestTI from scratch in C++. Compared to the storage engine of Prometheus, ForestTI achieves 1.79x higher insertion throughput, 52.1% lower query latency, and 56.9% lower memory occupation. We have released the open-source code of ForestTI for public access.

Funder

Direct Grant for Research, The Chinese University of Hong Kong

The Research Grants Council of the Hong Kong Special Administrative Region, China

Publisher

Association for Computing Machinery (ACM)

