Scalable communication protocols for dynamic sparse data exchange

Published:2010-05 Issue:5 Volume:45 Page:159-168
ISSN:0362-1340
Container-title:ACM SIGPLAN Notices
language:en
Short-container-title:SIGPLAN Not.

Author:

Hoefler Torsten¹, Siebert Christian², Lumsdaine Andrew¹

Affiliation:

1. Indiana University, Bloomington, IN, USA

2. NEC Laboratories Europe, Sankt Augustin, Germany

Abstract

Many large-scale parallel programs follow a bulk synchronous parallel (BSP) structure with distinct computation and communication phases. Although the communication phase in such programs may involve all (or large numbers) of the participating processes, the actual communication operations are usually sparse in nature. As a result, communication phases are typically expressed explicitly using point-to-point communication operations or collective operations. We define the dynamic sparse data-exchange (DSDE) problem and derive bounds in the well-known LogGP model. While current approaches work well with static applications, they run into limitations as modern applications grow in scale, and as the problems that are being solved become increasingly irregular and dynamic. To enable the compact and efficient expression of the communication phase, we develop suitable sparse communication protocols for irregular applications at large scale. We discuss different irregular applications and show the sparsity in the communication for real-world input data. We discuss the time and memory complexity of commonly used protocols for the DSDE problem and develop NBX, a novel fast algorithm with constant memory overhead for solving it. Algorithm NBX improves the runtime of a sparse data-exchange among 8,192 processors on BlueGene/P by a factor of 5.6. In an application study, we show improvements of up to a factor of 28.9 for a parallel breadth-first search on 8,192 BlueGene/P processors.

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Graphics and Computer-Aided Design, Software

Link

https://dl.acm.org/doi/pdf/10.1145/1837853.1693476

References: 30 articles.

1. A bridging model for parallel computation

2. Sparse collective operations for MPI

3. LogGP: Incorporating Long Messages into the LogP Model for Parallel Computation

Cited by 21 articles.

1. Parallel Pattern Language Code Generation;Proceedings of the 15th International Workshop on Programming Models and Applications for Multicores and Manycores;2024-03-03

2. Scalable adaptive algorithms for next-generation multiphase flow simulations;2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS);2023-05

3. Efficient Distributed Matrix-free Multigrid Methods on Locally Refined Meshes for FEM Computations;ACM Transactions on Parallel Computing;2023-03-29

4. Scalable computational kernels for mortar finite element methods;Engineering with Computers;2023-01-25

5. The deal.II library, Version 9.4;Journal of Numerical Mathematics;2022-07-17