Affiliation:
1. Technische Universität München
2. CedarDB
Abstract
Businesses are increasingly demanding real-time analytics on up-to-date data. However, current solutions fail to efficiently combine transactional and analytical processing in a single system. Instead, they rely on extract-transform-load pipelines to transfer transactional data to analytical systems, which introduces a significant delay in the time-to-insight. In this paper, we address this need by proposing a new storage engine design for the cloud, called Colibri, that enables hybrid transactional and analytical processing beyond main memory. Colibri features a hybrid column-row store optimized for both workloads, leveraging emerging hardware trends. It effectively separates hot and cold data to accommodate diverse access patterns and storage devices. Our extensive experiments showcase up to 10x performance improvements for processing hybrid workloads on solid-state drives and cloud object stores.
Publisher
Association for Computing Machinery (ACM)