Abstract
In this work, we address an online job scheduling problem in a large distributed computing environment. Each job has a priority and a resource demand, runs for an unknown amount of time, and is malleable, i.e., the number of workers allotted to it can fluctuate during its execution. We subdivide the problem into (a) determining a fair amount of resources for each job and (b) assigning each job to a corresponding number of processing elements. Our approach is fully decentralized, uses lightweight communication, and arranges each job as a binary tree of workers that can grow and shrink as necessary. Using the NP-complete problem of propositional satisfiability (SAT) as a case study, we show experimentally on up to 128 machines (6144 cores) that our approach leads to near-optimal utilization, imposes minimal computational overhead, and performs fair scheduling of incoming jobs within a few milliseconds.
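To make subproblem (a) concrete, the following is a minimal, centralized sketch of one possible notion of fairness: priority-weighted shares capped by each job's demand (a water-filling scheme). It is an illustration under stated assumptions, not the decentralized method of the paper; the names `Job`, `fair_shares`, and `total_workers` are hypothetical, and shares are left fractional rather than rounded to whole workers.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str        # hypothetical identifier, for illustration only
    priority: float  # relative weight, assumed > 0
    demand: int      # maximum number of workers the job can use


def fair_shares(jobs, total_workers):
    """Priority-weighted water-filling (assumed fairness notion).

    Each job receives a share proportional to its priority, but never
    more than its demand. Capacity freed by capped jobs is redistributed
    among the remaining jobs. Returns fractional shares keyed by job name.
    """
    shares = {}
    active = list(jobs)
    remaining = float(total_workers)
    while active:
        weight = sum(j.priority for j in active)
        # Jobs whose proportional share would meet or exceed their demand.
        capped = [j for j in active
                  if remaining * j.priority / weight >= j.demand]
        if not capped:
            # No job is limited by its demand: split proportionally.
            for j in active:
                shares[j.name] = remaining * j.priority / weight
            break
        for j in capped:
            shares[j.name] = float(j.demand)
            remaining -= j.demand
        active = [j for j in active if j not in capped]
    return shares


example = fair_shares(
    [Job("A", 1.0, 2), Job("B", 1.0, 100), Job("C", 2.0, 100)],
    total_workers=14,
)
```

In this example, job A is capped at its demand of 2 workers, and the remaining 12 workers are split between B and C in the ratio 1:2 of their priorities. A real scheduler would additionally need to round shares to integers and, as the abstract states, compute them without a central coordinator.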
Publisher
Springer International Publishing
Cited by
5 articles.