NUMA-aware shared-memory collective communication for MPI

Published: 2013-06-17
Container-title: Proceedings of the 22nd international symposium on High-performance parallel and distributed computing

Author:

Li Shigang¹, Hoefler Torsten², Snir Marc³

Affiliation:

1. University of Science and Technology Beijing, Beijing, China

2. ETH Zurich, Zurich, Switzerland

3. University of Illinois at Urbana-Champaign and Argonne National Laboratory, Urbana, IL, USA

Publisher

ACM

References: 28 articles.

1. AMD. Software Optimization Guide for AMD Family 15h Processors, January 2012.

2. R. Aulwes, D. Daniel, N. Desai, R. Graham, L. Risinger, M. Taylor, T. Woodall, and M. Sukalski. Architecture of LA-MPI, a network-fault-tolerant MPI. In Proceedings of the 18th International Parallel and Distributed Processing Symposium, page 15, April 2004.

3. F. Blagojevic, P. Hargrove, C. Iancu, and K. Yelick. Hybrid PGAS runtime support for multicore nodes. In Proceedings of the Fourth Conference on Partitioned Global Address Space Programming Model, PGAS '10, pages 3:1--3:10. ACM, 2010. 10.1145/2020373.2020376

4. F. Broquedis, J. Clet-Ortega, S. Moreaud, N. Furmento, B. Goglin, G. Mercier, S. Thibault, and R. Namyst. hwloc: A generic framework for managing hardware affinities in HPC applications. In Proceedings of the 2010 18th Euromicro Conference on Parallel, Distributed and Network-based Processing, PDP '10, pages 180--186. IEEE Computer Society, 2010. 10.1109/PDP.2010.67

5. A. Friedley, T. Hoefler, G. Bronevetsky, A. Lumsdaine, and C.-C. Ma. Ownership Passing: efficient distributed memory programming on multi-core systems. In PPoPP'13: Proceedings of the 18th ACM symposium on Principles and Practice of Parallel Programming, 2013. Accepted at PPoPP'13. 10.1145/2442516.2442534

Cited by 11 articles.

1. Optimizing MPI Collectives on Shared Memory Multi-Cores;Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis;2023-11-11

2. FMI: Fast and Cheap Message Passing for Serverless Functions;Proceedings of the 37th International Conference on Supercomputing;2023-06-21

3. Optimizing MPI Collectives with Hierarchical Design for Efficient CPU Oversubscription;2023

4. Hybrid Approach to Optimize MPI Collectives by In-network-computation and Point-to-Point Messages;2022 7th International Conference on Computer and Communication Systems (ICCCS);2022-04-22

5. Near-optimal sparse allreduce for distributed deep learning;Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming;2022-03-28