Exploiting copy engines for intra-node MPI collective communication
Published: 2023-05-11
Issue: 16
Volume: 79
Pages: 17962–17982
ISSN: 0920-8542
Container title: The Journal of Supercomputing
Short container title: J Supercomput
Language: en
Authors: Cho Joong-Yeon, Seo Pu-Rum, Jin Hyun-Wook
Abstract
As multi/many-core processors are widely deployed in high-performance computing systems, efficient intra-node communication becomes increasingly important. Intra-node communication involves data copy operations to move messages from the source to the destination buffer. Researchers have tried to reduce the overhead of this copy operation, but a copy performed by the CPU still wastes CPU resources and even hinders overlap between computation and communication. The copy engine is a hardware component that can move data between intra-node buffers without CPU intervention; thus, the copy operation performed by the CPU can be offloaded onto the copy engine. In this paper, we aim at exploiting copy engines for MPI blocking collective communication, such as broadcast and gather operations. MPI is a message-passing parallel programming model that provides point-to-point, collective, and one-sided communication. Research has been conducted on utilizing the copy engine for MPI, but support for collective communication has not yet been studied. We propose asynchronism in blocking collective communication and a CE-CPU hybrid approach that utilizes both the copy engine and the CPU for intra-node collective communication. Measurement results show that the proposed approach reduces the overall execution time of a microbenchmark and a synthetic application that perform collective communication and computation by up to 72% and 57%, respectively.
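The CE-CPU hybrid idea described in the abstract, splitting an intra-node message copy between a hardware copy engine and the CPU so the two proceed concurrently, can be illustrated with a minimal sketch. The sketch below is an assumption-laden Python analogy, not the paper's implementation: a background thread stands in for the copy engine, the main thread plays the CPU, and the function name `hybrid_copy` and the `ce_fraction` split parameter are hypothetical.

```python
import threading

def hybrid_copy(src: bytes, ce_fraction: float = 0.5) -> bytearray:
    """Copy `src` into a new buffer, splitting the work between the
    'CPU' (main thread) and a stand-in 'copy engine' (background thread).

    `ce_fraction` is a hypothetical tuning knob analogous to dividing a
    message between the copy engine and the CPU in a hybrid scheme."""
    dst = bytearray(len(src))
    split = int(len(src) * ce_fraction)

    # The copy-engine share proceeds asynchronously; meanwhile the CPU
    # is free to overlap its own share (or other computation).
    def engine_copy():
        dst[:split] = src[:split]

    engine = threading.Thread(target=engine_copy)
    engine.start()

    # The CPU handles the remainder of the message concurrently.
    dst[split:] = src[split:]

    # Blocking semantics: return only after the whole copy completes,
    # mirroring a blocking collective that hides asynchrony internally.
    engine.join()
    return dst
```

A real copy engine moves data without consuming CPU cycles, which a thread cannot model; the sketch only shows the work-splitting and the blocking completion point.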
Funder: Ministry of Science and ICT, South Korea
Publisher: Springer Science and Business Media LLC
Subjects: Hardware and Architecture; Information Systems; Theoretical Computer Science; Software
Cited by: 1 article