Affiliations:
1. Lawrence Livermore National Laboratory, Livermore, CA, USA
2. University of Tennessee, Knoxville, TN, USA
Abstract
High-performance computing (HPC) workflows are undergoing tumultuous changes, including an explosion in size and complexity. Despite these changes, most batch job systems still use slow, centralized schedulers. Generalized hierarchical scheduling (GHS) solves many of the challenges that face modern workflows, but GHS has not been widely adopted in HPC. A major difficulty that hinders adoption is the lack of a performance model to aid in configuring GHS for optimal performance on a given application. We propose an analytical performance model of GHS, and we validate our proposed model with four different applications on a moderately sized system. Our validation shows that our model is extremely accurate at predicting the performance of GHS, explaining 98.7% of the variance (i.e., an R² statistic of 0.987). Our results also support the claim that GHS overcomes scheduling throughput problems; we measured throughput improvements of up to 270× on our moderately sized system. We then apply our performance model to a pre-exascale system, where our model predicts throughput improvements of four orders of magnitude and provides insight into optimally configuring GHS on next-generation systems.
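As an illustration of the validation metric quoted in the abstract, the coefficient of determination compares the squared error of a model's predictions against the variance of the measurements, so a value of 0.987 corresponds to 98.7% of the variance explained. The sketch below is a minimal, generic implementation and is not the authors' validation code. The function name `r_squared` and the sample values are hypothetical placeholders chosen for easy arithmetic; they are not data from the paper.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: error left over after applying the model.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: variance of the measurements around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical placeholder values (NOT measurements from the paper).
measured = [10.0, 20.0, 30.0, 40.0]
modeled = [12.0, 18.0, 31.0, 39.0]
print(round(r_squared(measured, modeled), 3))  # prints 0.98
```

Here the residual sum of squares is 10 and the total sum of squares is 500, so the placeholder model explains 98% of the variance in the placeholder measurements.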
Subject
Hardware and Architecture, Theoretical Computer Science, Software
References: 57 articles.
1. Flux: Overcoming scheduling challenges for exascale workflows
2. Parsl
3. The Legion Resource Management System
4. Computational data analysis workflow systems (2020). https://github.com/common-workflow-language/common-workflow-language/wiki/Existing-Workflow-systems (Retrieved April 19, 2020).
Cited by
2 articles.
1. Fluxion: A Scalable Graph-Based Resource Model for HPC Scheduling Challenges;Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis;2023-11-12
2. Reproducing and Extending Analytical Performance Models of Generalized Hierarchical Scheduling;2022 IEEE 18th International Conference on e-Science (e-Science);2022-10