Affiliation:
1. Columbia University, New York, USA
2. University of Texas at Austin, Austin, USA
3. University of Illinois at Urbana-Champaign, Urbana, USA
Abstract
Motivated by emerging big streaming data processing paradigms (e.g., Twitter Storm, Streaming MapReduce), we investigate the problem of scheduling graphs over a large cluster of servers. Each graph is a job, where nodes represent compute tasks and edges indicate data-flows between these compute tasks. Jobs (graphs) arrive randomly over time, and upon completion, leave the system. When a job arrives, the scheduler needs to partition the graph and distribute it over the servers to satisfy load balancing and cost considerations. Specifically, neighboring compute tasks in the graph that are mapped to different servers incur load on the network; thus a mapping of the jobs among the servers incurs a cost that is proportional to the number of "broken edges". We propose a low complexity randomized scheduling algorithm that, without service preemptions, stabilizes the system with graph arrivals/departures; more importantly, it allows a smooth trade-off between minimizing average partitioning cost and average queue lengths. Interestingly, to avoid service preemptions, our approach does not rely on a Gibbs sampler; instead, we show that the corresponding limiting invariant measure has an interpretation stemming from a loss system.
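The "broken edges" cost described in the abstract can be illustrated with a small sketch. This is not the paper's scheduling algorithm; it only counts the edges of one job graph whose endpoints are placed on different servers. The function name `broken_edge_cost`, the edge-list representation, and the example graph are all assumptions made for illustration.

```python
from typing import Dict, Hashable, Iterable, Tuple

Edge = Tuple[Hashable, Hashable]


def broken_edge_cost(edges: Iterable[Edge], placement: Dict[Hashable, int]) -> int:
    """Count edges whose two endpoint tasks are mapped to different servers.

    `placement` maps each task (graph node) to a server index. Both the name
    and the representation are hypothetical, chosen only for this sketch.
    """
    return sum(1 for u, v in edges if placement[u] != placement[v])


# Hypothetical job: four compute tasks connected in a cycle.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d")]

# Tasks a and b share server 0; tasks c and d share server 1.
placement = {"a": 0, "b": 0, "c": 1, "d": 1}

cost = broken_edge_cost(edges, placement)
print(cost)  # edges (b, c) and (a, d) cross servers, so this prints 2
```

Placing all four tasks on a single server would give a cost of zero but concentrate the load on that server, which is the tension between partitioning cost and load balancing that the abstract refers to.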
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Networks and Communications, Hardware and Architecture, Software
Cited by
4 articles.
1. Delay-Optimal Distributed Computation Offloading in Wireless Edge Networks;IEEE/ACM Transactions on Networking;2024-08
2. Latency-Optimal Pyramid-based Joint Communication and Computation Scheduling for Distributed Edge Computing;IEEE INFOCOM 2023 - IEEE Conference on Computer Communications;2023-05-17
3. S2CE;Proceedings of the 15th ACM International Conference on Distributed and Event-based Systems;2021-06-28
4. Incentive Mechanisms for Resource Scaling-out Game of Stream Big Data Analytics;Journal of Grid Computing;2018-09-06