Affiliation:
1. Imperial College London
Abstract
Single-node multi-core stream processing engines (SPEs) can process hundreds of millions of tuples per second. Yet making them fault-tolerant with exactly-once semantics while retaining this performance is an open challenge: due to the limited I/O bandwidth of a single node, it becomes infeasible to persist all stream data and operator state during execution. Instead, single-node SPEs rely on upstream distributed systems, such as Apache Kafka, to recover stream data after failure, which necessitates complex cluster-based deployments. This lack of built-in fault-tolerance features has hindered the adoption of single-node SPEs.
We describe Scabbard, the first single-node SPE that supports exactly-once fault-tolerance semantics despite limited local I/O bandwidth. Scabbard achieves this by integrating persistence operations with the query workload. Within the operator graph, Scabbard determines when to persist streams based on the selectivity of operators: by persisting streams after operators that discard data, it can substantially reduce the required I/O bandwidth. As part of the operator graph, Scabbard supports parallel persistence operations and uses markers to decide when to discard persisted data. The persisted data volume is further reduced using workload-specific compression: Scabbard monitors stream statistics and dynamically generates computationally efficient compression operators. Our experiments show that Scabbard can execute stream queries that process over 200 million tuples per second while recovering from failures with sub-second latencies.
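As a rough illustration of the selectivity argument, the sketch below estimates the tuple rate on each edge of a linear operator pipeline and picks the edge with the lowest rate as the place to persist. It is a minimal sketch under stated assumptions, not Scabbard's actual algorithm: the `Operator` class, the function names, and the selectivity values are all hypothetical, and the model ignores operator state, tuple sizes, and recovery cost.

```python
from dataclasses import dataclass


@dataclass
class Operator:
    name: str
    # Hypothetical estimate: output tuples produced per input tuple.
    selectivity: float


def edge_rates(input_rate, operators):
    """Return the tuple rate on every edge of a linear pipeline.

    Index 0 is the raw input stream; index i is the stream
    leaving operators[i - 1].
    """
    rates = [input_rate]
    for op in operators:
        rates.append(rates[-1] * op.selectivity)
    return rates


def cheapest_persist_point(input_rate, operators):
    """Pick the edge with the lowest tuple rate (first one on ties)."""
    rates = edge_rates(input_rate, operators)
    idx = min(range(len(rates)), key=rates.__getitem__)
    return idx, rates[idx]


# Illustrative pipeline: a filter that keeps 10% of tuples,
# a projection, and a flat-map that triples the tuple count.
pipeline = [
    Operator("filter", 0.1),
    Operator("project", 1.0),
    Operator("flat_map", 3.0),
]
idx, rate = cheapest_persist_point(200e6, pipeline)
print(f"persist on edge {idx} at about {rate:.0f} tuples/s")
```

In this toy model, persisting after the filter writes roughly a tenth of the input volume, which is the intuition behind placing persistence after operators that discard data.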
Publisher
Association for Computing Machinery (ACM)
Subject
General Earth and Planetary Sciences; Water Science and Technology; Geography, Planning and Development
Cited by
12 articles.
1. Data-Aware Adaptive Compression for Stream Processing;IEEE Transactions on Knowledge and Data Engineering;2024-09
2. DIBA: A Re-Configurable Stream Processor;IEEE Transactions on Knowledge and Data Engineering;2024-09
3. A Comprehensive Benchmarking Analysis of Fault Recovery in Stream Processing Frameworks;Proceedings of the 18th ACM International Conference on Distributed and Event-based Systems;2024-06-24
4. Safe Shared State in Dataflow Systems;Proceedings of the 18th ACM International Conference on Distributed and Event-based Systems;2024-06-24
5. Checkpointing models for tasks of different types;ACM Transactions on Modeling and Performance Evaluation of Computing Systems;2024-05-21