ITERATIVE AND PARALLEL PERFORMANCE ANALYSIS OF NON-BLOCKING COMMUNICATION ALGORITHMS IN THE MASSIVELY PARALLEL NEUTRON TRANSPORT CODE PIDOTS

Author:

Yessayan Raffi,Azmy Yousry Y.,Zerr R. Joseph

Abstract

The PIDOTS neutral particle transport code utilizes a red/black implementation of the Parallel Gauss-Seidel algorithm to solve the SN approximation of the neutron transport equation on 3D Cartesian meshes. PIDOTS is designed for execution on massively parallel platforms and is capable of using the full resources of modern, leadership class high performance computers. Initial testing revealed that some configurations of PIDOTS’s Integral Transport Matrix Method solver demonstrated unexpectedly poor parallel scaling. Work at Idaho and Los Alamos National Laboratories then revealed that this inefficiency was a result of the accumulation of high-cost latency events in the complex blocking communication networks employed during each PIDOTS iteration. That work explored the possibility of minimizing those inefficiencies while maintaining a blocking communications model. While significant speedups were obtained, it was shown that fully mitigating the problem on general-purpose platforms was highly unlikely for a blocking code. This work continues that analysis by implementing a deeply interleaved non-blocking communication model into PIDOTS. This new model benefits from the optimization work performed on the blocking model while also providing significant opportunities to overlap the remaining un-mitigated communication costs with computation. Additionally, our new approach is easily transferable to other similarly spatially decomposed codes. The resulting algorithm was tested on LANL’s Trinity system at up to 32,768 processors and was found at that processor count to effectively hide 100% of MPI communication cost – equivalently 20% of the red/black phase time. It is expected that the implemented interleaving algorithm can fully support far higher processor counts and completely hide communication costs up ~50% of total iteration time.

Publisher

EDP Sciences

Reference6 articles.

1. Yessayan R., Azmy Y. Y., Schunert S., Garvey C., “Analysis of Communication Performance Degradation of The Radiation Transport Code PIDOTS On High-Utilization, Multi-User HPC Systems”, Proceedings of PHYSOR 2018, Cancun, Mexico, April 22-26, 2018

2. Yessayan R., Azmy Y. Y., Zerr R. J., Schunert S., “Mitigation of Communication Latency Based Slowdowns in the PIDOTS Massively Parallel SN Transport Code”, Proc. of M&C 2019, Portland, OR, Aug 25-29, 2019

3. “Trinity: Technical Specifications,” https://www.lanl.gov/projects/trinity/specifications.php

4. Gabriel E., et al., “Open MPI: Goals, Concept, and Design of a Next Generation MPI Implementation,” Proceedings of 11th European PVM/MPI User’s Group Meeting, Budapest, Hungary, Sept. 2004

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3