Affiliation:
1. IBM T. J. Watson Research Center
2. IBM Analytics Platform
3. Sandia National Labs
Abstract
Guaranteed tuple processing has become critically important for many streaming applications. This paper describes how we enabled IBM Streams, an enterprise-grade stream processing system, to provide data processing guarantees. Our solution goes from language-level abstractions to a runtime protocol. As a result, with a couple of simple annotations at the source code level, IBM Streams developers can define consistent regions, allowing any subgraph of their streaming application to achieve guaranteed tuple processing. At runtime, a consistent region periodically executes a variation of the Chandy-Lamport snapshot algorithm to establish a consistent global state for that region. The coupling of consistent states with data replay enables guaranteed tuple processing.
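The coupling of consistent states with replay described in the abstract can be illustrated with a toy, single-process sketch. This is not the IBM Streams protocol or its API: the names `Accumulator`, `run`, `period`, and `fail_at` are hypothetical, the "region" is a simple linear chain of two stateful operators, and the periodic checkpoint stands in for a marker flowing through the region in FIFO order behind the tuples it follows.

```python
import copy


class Accumulator:
    """Hypothetical stateful operator: folds each input into its state."""

    def __init__(self, fn, initial):
        self.fn = fn
        self.state = initial

    def process(self, item):
        self.state = self.fn(self.state, item)
        return self.state  # forwarded downstream as the next tuple


def run(data, period, fail_at=None):
    """Push `data` through a two-operator chain with periodic checkpoints.

    `period` is the number of tuples between checkpoints. `fail_at` is an
    assumed failure point: the source offset at which a single simulated
    failure occurs. Both parameters are illustrative only.
    """
    ops = [
        Accumulator(lambda s, x: s + x, 0),   # running sum of the input
        Accumulator(max, float("-inf")),      # maximum running sum seen
    ]
    offset = 0  # source state: index of the next tuple to emit
    # A checkpoint pairs the source offset with every operator's state.
    checkpoint = (offset, [copy.deepcopy(op.state) for op in ops])
    failed = False

    while offset < len(data):
        if fail_at is not None and not failed and offset == fail_at:
            # Failure: roll the whole region back to the last consistent
            # state, then let the source replay from the saved offset.
            failed = True
            offset, saved = checkpoint
            for op, state in zip(ops, saved):
                op.state = copy.deepcopy(state)
            continue

        item = data[offset]
        offset += 1
        for op in ops:
            item = op.process(item)

        if offset % period == 0:
            # The "marker" reaches each operator after all earlier tuples,
            # so the states captured here are mutually consistent.
            saved = [copy.deepcopy(op.state) for op in ops]
            checkpoint = (offset, saved)

    return [op.state for op in ops]


if __name__ == "__main__":
    stream = [3, -1, 4, -1, 5, -9, 2, 6]
    print(run(stream, period=3))
    print(run(stream, period=3, fail_at=5))
```

In this sketch, the run with a simulated failure ends in the same operator states as the failure-free run, because tuples processed after the last checkpoint are discarded on rollback and then replayed exactly once against the restored state.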
Subject
General Earth and Planetary Sciences; Water Science and Technology; Geography, Planning and Development
Cited by
20 articles.
1. CheckMate: Evaluating Checkpointing Protocols for Streaming Dataflows;2024 IEEE 40th International Conference on Data Engineering (ICDE);2024-05-13
2. A survey on the evolution of stream processing systems;The VLDB Journal;2023-11-22
3. Fault Tolerance of Stateful Microservices for Industrial Edge Scenarios;2023 IEEE International Conference on Joint Cloud Computing (JCC);2023-07
4. Bounding substreams in distributed stream processing;Information Systems;2023-07
5. Substream management in distributed streaming dataflows;Proceedings of the 16th ACM International Conference on Distributed and Event-Based Systems;2022-06-27