I/O performance of the Santos Dumont supercomputer

Published:2019-09-12 Issue:2 Volume:34 Page:227-245
ISSN:1094-3420
Container-title:The International Journal of High Performance Computing Applications
language:en

Author:

Bez Jean Luca¹, Carneiro André Ramos², Pavan Pablo José¹, Girelli Valéria Soldera¹, Boito Francieli Zanon³, Fagundes Bruno Alves², Osthoff Carla², da Silva Dias Pedro Leite⁴, Méhaut Jean-François³, Navaux Philippe OA¹

Affiliation:

1. Institute of Informatics, Federal University of Rio Grande do Sul (UFRGS), Porto Alegre, Brazil

2. Laboratory for Scientific Computing (LNCC), Petrópolis, Brazil

3. Université Grenoble Alpes, Inria, CNRS, Grenoble INP, LIG, Grenoble, France

4. Institute of Astronomy, Geophysics and Atmospheric Sciences, University of São Paulo (USP), São Paulo, Brazil

Abstract

In this article, we study the I/O performance of the Santos Dumont supercomputer, since the gap between processing and data access speeds causes many applications to spend a large portion of their execution on I/O operations. For a large-scale, expensive supercomputer, it is essential to ensure applications achieve the best I/O performance to promote efficient usage. We monitor a week of the machine’s activity and present a detailed study on the obtained metrics, aiming at providing an understanding of its workload. From experiences with one numerical simulation, we identified large I/O performance differences between the MPI implementations available to users. We investigated the phenomenon and narrowed it down to collective I/O operations with small request sizes. For these, we concluded that the customized MPI implementation by the machine’s vendor (used by more than 20% of the jobs) presents the worst performance. By investigating the issue, we provide information to help improve future MPI-IO collective write implementations and practical guidelines to help users and steer future system upgrades. Finally, we discuss the challenge of describing applications’ I/O behavior without depending on information from users. Such a description allows for identifying an application’s I/O bottlenecks and proposing ways of improving its I/O performance. We propose a methodology to do so, and use GROMACS, the application with the largest number of jobs in 2017, as a case study.

Funder

Coordenação de Aperfeiçoamento de Pessoal de Nível Superior

Publisher

SAGE Publications

Subject

Hardware and Architecture,Theoretical Computer Science,Software

Link

http://journals.sagepub.com/doi/pdf/10.1177/1094342019868526
