Performance-Aware Scheduling of Parallel Applications on Non-Dedicated Clusters

Published:2019-09-02 Issue:9 Volume:8 Page:982
ISSN:2079-9292
Container-title:Electronics
language:en
Short-container-title:Electronics

Author:

Cascajo Alberto, Singh David E., Carretero Jesus

Abstract

This work presents an HPC framework that provides new strategies for resource management and job scheduling, based on executing different applications in shared compute nodes to maximize platform utilization. The framework includes a scalable monitoring tool that analyzes the utilization of the platform’s compute nodes. We also introduce an extension of CLARISSE, a middleware for data-staging coordination and control on large-scale HPC platforms, that uses the information provided by the monitor in combination with application-level analysis to detect performance degradation in the running applications. This degradation, which arises because the applications share the compute nodes and may compete for their resources, is avoided by means of dynamic application migration. We describe the architecture and present a practical evaluation of the proposal, which shows significant performance improvements of up to 20% in makespan and 10% in energy consumption compared to a non-optimized execution.

Publisher

MDPI AG

Subject

Electrical and Electronic Engineering, Computer Networks and Communications, Hardware and Architecture, Signal Processing, Control and Systems Engineering

Link

https://www.mdpi.com/2079-9292/8/9/982/pdf

References: 41 articles.

1. Hybrid Job Scheduling for Improved Cluster Utilization;Ari,2014

2. Slurm: Simple Linux utility for resource management;Yoo,2003

3. Reducing communication costs in collective I/O in multi-core cluster systems with non-exclusive scheduling

4. DaeMon - User Manual;https://www.arcos.inf.uc3m.es/acascajo/daemon/

Cited by 8 articles.

1. Detecting interference between applications and improving the scheduling using malleable application clones;The International Journal of High Performance Computing Applications;2023-12-13

2. Monitoring InfiniBand Networks to React Efficiently to Congestion;IEEE Micro;2023-03-01

3. LIMITLESS — LIght-weight MonItoring Tool for LargE Scale Systems;Microprocessors and Microsystems;2022-09

4. Improving Congestion Control through Fine-Grain Monitoring of InfiniBand Networks;2022 IEEE Symposium on High-Performance Interconnects (HOTI);2022-08

5. Energy Consumption Studies of WRF Executions with the LIMITLESS Monitor;Communications in Computer and Information Science;2022