Affiliation:
1. ETH Zurich, Switzerland
Abstract
Modern high-performance networks offer remote direct memory access (RDMA) that exposes a process' virtual address space to other processes in the network. The Message Passing Interface (MPI) specification has recently been extended with a programming interface called MPI-3 Remote Memory Access (MPI-3 RMA) for efficiently exploiting state-of-the-art RDMA features. MPI-3 RMA enables a powerful programming model that alleviates many message passing downsides. In this work, we design and develop bufferless protocols that demonstrate how to implement this interface and support scaling to millions of cores with negligible memory consumption while providing highest performance and minimal overheads. To arm programmers, we provide a spectrum of performance models for RMA functions that enable rigorous mathematical analysis of application performance and facilitate the development of codes that solve given tasks within specified time and energy budgets. We validate the usability of our library and models with several application studies with up to half a million processes. In a wider sense, our work illustrates how to use RMA principles to accelerate computation- and data-intensive codes.
Publisher
Association for Computing Machinery (ACM)
Cited by
4 articles.
1. Decentralized lock-free distributed queue in MPI remote memory access model;E3S Web of Conferences;2024
2. Modularis;Proceedings of the VLDB Endowment;2021-09
3. Parallel Tree Algorithms for AMR and Non-Standard Data Access;ACM Transactions on Mathematical Software;2020-12-31
4. High-Performance Parallel Graph Coloring with Strong Guarantees on Work, Depth, and Quality;SC20: International Conference for High Performance Computing, Networking, Storage and Analysis;2020-11