Affiliation:
1. Siberian State University of Telecommunications and Information Science
Abstract
A hierarchical MPI barrier synchronization algorithm is proposed that groups processes sharing common resources at a level of the memory hierarchy (L2/L3 cache, NUMA node, socket). Synchronization is performed within the groups at each level of the hierarchy. Experiments on a dual-socket server with two Huawei Kunpeng processors (128 cores, 4 NUMA nodes) showed that the proposed algorithm, with processes grouped by NUMA node, gives the lowest execution time among the known methods compared and is robust to different process placement schemes.
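The grouping idea in the abstract can be illustrated with a small two-level sketch. This is not the authors' implementation and it does not use MPI: threads stand in for MPI processes, `threading.Barrier` stands in for the intra-group and inter-group synchronization, and the names `HierarchicalBarrier`, `group_size`, and `run_demo` are hypothetical. It assumes consecutive ranks share a group (for example, one NUMA node) and that the first rank in each group acts as its leader.

```python
import threading


class HierarchicalBarrier:
    """Two-level barrier sketch: ranks are grouped, one leader per group.

    Assumption: ranks [g*group_size, (g+1)*group_size) share a resource
    such as a NUMA node; the first rank of each group is the leader.
    """

    def __init__(self, n_ranks, group_size):
        if n_ranks % group_size != 0:
            raise ValueError("n_ranks must be a multiple of group_size")
        self.group_size = group_size
        n_groups = n_ranks // group_size
        # One reusable barrier per group, used for both arrival and release.
        self.local = [threading.Barrier(group_size) for _ in range(n_groups)]
        # One barrier shared by the group leaders only.
        self.leaders = threading.Barrier(n_groups)

    def wait(self, rank):
        group = rank // self.group_size
        is_leader = rank % self.group_size == 0
        # Phase 1: every rank in the group arrives.
        self.local[group].wait()
        # Phase 2: only leaders synchronize across groups.
        if is_leader:
            self.leaders.wait()
        # Phase 3: the leader's arrival here releases the rest of its group.
        self.local[group].wait()


def run_demo(n_ranks=8, group_size=4):
    """Run one barrier episode and return the ordered arrive/leave events."""
    barrier = HierarchicalBarrier(n_ranks, group_size)
    events = []
    lock = threading.Lock()

    def worker(rank):
        with lock:
            events.append(("arrive", rank))
        barrier.wait(rank)
        with lock:
            events.append(("leave", rank))

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(n_ranks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return events


if __name__ == "__main__":
    log = run_demo()
    last_arrive = max(i for i, e in enumerate(log) if e[0] == "arrive")
    first_leave = min(i for i, e in enumerate(log) if e[0] == "leave")
    print("barrier held:", last_arrive < first_leave)
```

In an MPI program the groups would more plausibly be sub-communicators; `MPI_Comm_split_type` with `MPI_COMM_TYPE_SHARED` gives a shared-memory group, while finer splits by cache or NUMA node depend on the MPI implementation or on topology information obtained separately.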
Publisher
Siberian State University of Telecommunications and Informatics
Cited by
1 article.
1. MPI reduction and broadcast algorithms for computer clusters with multistage interconnection networks; The Herald of the Siberian State University of Telecommunications and Information Science; 2023-09-28