Affiliation:
1. Univ Rennes, Inria, CNRS, IRISA, Rennes, France
Abstract
Scaling up large-scale scientific applications on supercomputing facilities largely depends on the ability to efficiently scale up data storage and retrieval. However, there is an ever-widening gap between I/O and computing performance. To address this gap, an increasingly popular approach consists in introducing new intermediate storage tiers (node-local storage, burst buffers, etc.) between the compute nodes and the traditional global shared parallel file system. Unfortunately, without advanced techniques to allocate and size these resources, they remain underutilized. In this article, we investigate how heterogeneous storage resources can be allocated on a high-performance computing platform, just like compute resources. To this purpose, we introduce StorAlloc, a simulator used as a testbed for assessing storage-aware job scheduling algorithms and evaluating various storage infrastructures. We illustrate its usefulness through a large series of experiments showing how this tool can be used to size a burst-buffer partition on a top-tier supercomputer, using the job history of a production year.
Subject
Computational Theory and Mathematics, Computer Networks and Communications, Computer Science Applications, Theoretical Computer Science, Software