Damaris

Author:

Dorier Matthieu (1), Antoniu Gabriel (2), Cappello Franck (1), Snir Marc (1), Sisneros Robert (3), Yildiz Orcun (2), Ibrahim Shadi (2), Peterka Tom (1), Orf Leigh (4)

Affiliation:

1. Argonne National Laboratory, IL, USA

2. Inria, Rennes - Bretagne Atlantique Research Centre, France

3. University of Illinois at Urbana Champaign, IL

4. University of Wisconsin - Madison, WI

Abstract

With exascale computing on the horizon, reducing performance variability in data management tasks (storage, visualization, analysis, etc.) is becoming a key challenge in sustaining high performance. This variability significantly impacts the overall application performance at scale and its predictability over time. In this article, we present Damaris, a system that leverages dedicated cores in multicore nodes to offload data management tasks, including I/O, data compression, scheduling of data movements, in situ analysis, and visualization. We evaluate Damaris with the CM1 atmospheric simulation and the Nek5000 computational fluid dynamic simulation on four platforms, including NICS’s Kraken and NCSA’s Blue Waters. Our results show that (1) Damaris fully hides the I/O variability as well as all I/O-related costs, thus making simulation performance predictable; (2) it increases the sustained write throughput by a factor of up to 15 compared with standard I/O approaches; (3) it allows almost perfect scalability of the simulation up to over 9,000 cores, as opposed to state-of-the-art approaches that fail to scale; and (4) it enables a seamless connection to the VisIt visualization software to perform in situ analysis and visualization in a way that impacts neither the performance of the simulation nor its variability. In addition, we extended our implementation of Damaris to also support the use of dedicated nodes and conducted a thorough comparison of the two approaches—dedicated cores and dedicated nodes—for I/O tasks with the aforementioned applications.

Publisher

Association for Computing Machinery (ACM)

Subject

Computational Theory and Mathematics, Computer Science Applications, Hardware and Architecture, Modelling and Simulation, Software

References: 75 articles.

1. DataStager

2. Scalable I/O forwarding framework for high-performance computing systems

3. ANL. 2015. MPICH. Retrieved from http://www.mpich.org.

4. A Benchmark Simulation for Moist Nonhydrostatic Numerical Models

Cited by 20 articles.

1. Detecting interference between applications and improving the scheduling using malleable application clones;The International Journal of High Performance Computing Applications;2023-12-13

2. Dask-Extended External Tasks for HPC/ML In transit Workflows;Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis;2023-11-12

3. I/O Access Patterns in HPC Applications: A 360-Degree Survey;ACM Computing Surveys;2023-09-15

4. Towards elastic in situ analysis for high-performance computing simulations;Journal of Parallel and Distributed Computing;2023-07

5. Automated Continual Learning of Defect Identification in Coherent Diffraction Imaging;2022 IEEE/ACM International Workshop on Artificial Intelligence and Machine Learning for Scientific Applications (AI4S);2022-11
