Dalton

Published:2022-11 Issue:3 Volume:16 Page:491-504
ISSN:2150-8097
Container-title:Proceedings of the VLDB Endowment
language:en
Short-container-title:Proc. VLDB Endow.

Author:

Zapridou Eleni¹, Mytilinis Ioannis², Ailamaki Anastasia¹

Affiliation:

1. EPFL

2. Oracle

Abstract

To sustain the input rate of high-throughput streams, modern stream processing systems rely on parallel execution. However, skewed data yield imbalanced load assignments and create stragglers that hinder scalability. Deciding on a static partitioning for a given set of "hot" keys is not sufficient, as these keys are not known in advance and, even worse, the data distribution can change unpredictably. Existing algorithms either optimize for a specific distribution or, in order to adapt, assume a centralized partitioner that processes every incoming tuple and observes the whole workload. However, this is not realistic in a distributed environment, where multiple parallel upstream operators exist, as the centralized partitioner itself becomes the bottleneck and limits scalability. In this work, we propose Dalton: a lightweight, adaptive, yet scalable partitioning operator that relies on reinforcement learning. By memoizing state and dynamically keeping track of recent experience, Dalton: i) adjusts its policy at runtime and quickly adapts to the workload, ii) avoids redundant computations and minimizes the per-tuple partitioning overhead, and iii) efficiently scales out to multiple instances that learn cooperatively and converge to a joint policy. Our experiments indicate that Dalton scales regardless of the input data distribution and sustains 1.3X to 6.7X higher throughput than existing approaches.
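The load imbalance that the abstract attributes to skewed keys can be illustrated with a small sketch. This is not Dalton's reinforcement-learning policy, which the abstract does not specify; it only contrasts static key-based partitioning with a simple load-aware alternative (the well-known "two choices" heuristic) on a synthetic stream. The function names, the way the second candidate worker is picked, and the stream itself are assumptions made for illustration.

```python
def hash_partition(keys, num_workers):
    """Static partitioning: every tuple of a key goes to one fixed worker."""
    load = [0] * num_workers
    for key in keys:
        load[key % num_workers] += 1
    return load


def two_choice_partition(keys, num_workers):
    """Load-aware partitioning: each key has two candidate workers and each
    tuple is sent to whichever candidate currently has less load."""
    load = [0] * num_workers
    for key in keys:
        first = key % num_workers
        second = (key + 1) % num_workers  # hypothetical second candidate
        target = first if load[first] <= load[second] else second
        load[target] += 1
    return load


def imbalance(load):
    """Maximum load divided by mean load; 1.0 means perfectly balanced."""
    return max(load) / (sum(load) / len(load))


# Synthetic skewed stream (an assumption): key 0 is "hot" and carries half
# of the 1000 tuples; keys 1..100 carry 5 tuples each.
stream = [0] * 500 + [k for k in range(1, 101) for _ in range(5)]

static_load = hash_partition(stream, 4)
adaptive_load = two_choice_partition(stream, 4)
print("static  :", static_load, "imbalance", imbalance(static_load))
print("adaptive:", adaptive_load, "imbalance", round(imbalance(adaptive_load), 2))
```

With static partitioning the worker that owns the hot key becomes the straggler. Splitting a key across two workers reduces the imbalance, but a stateful operator would then have to merge partial results per key downstream, and the load counters here are kept by a single partitioner, which is the centralized setting the abstract argues against.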

Publisher

Association for Computing Machinery (ACM)

Subject

General Earth and Planetary Sciences; Water Science and Technology; Geography, Planning and Development

Link

https://dl.acm.org/doi/pdf/10.14778/3570690.3570699

Reference: 44 articles.

1. PartLy

2. Prompt: Dynamic Data-Partitioning for Distributed Micro-batch Stream Processing Systems

3. The dataflow model

4. Online scheduling of dependent tasks of cloud’s workflows to enhance resource utilization and reduce the makespan using multiple reinforcement learning-based agents

5. Task scheduling, resource provisioning, and load balancing on scientific workflows using parallel SARSA reinforcement learning agents and genetic algorithm

Cited by 7 articles.

1. FlexSP: (1 + β)-Choice based Flexible Stream Partitioning for Stateful Operators;Proceedings of the 53rd International Conference on Parallel Processing;2024-08-12

2. Last Night in Sweden: A Vision for Resource-Intelligent Stream Reasoning;Proceedings of the 18th ACM International Conference on Distributed and Event-based Systems;2024-06-24

3. GLO: Towards Generalized Learned Query Optimization;2024 IEEE 40th International Conference on Data Engineering (ICDE);2024-05-13

4. ZeroTune: Learned Zero-Shot Cost Models for Parallelism Tuning in Stream Processing;2024 IEEE 40th International Conference on Data Engineering (ICDE);2024-05-13

5. Adaptive key partitioning in distributed stream processing;CCF Transactions on High Performance Computing;2024-01-12