Affiliation:
1. Ecole Polytechnique Fédérale de Lausanne (EPFL), Switzerland
Abstract
As the level of parallelism in manycore processors keeps increasing, providing efficient mechanisms for thread synchronization in concurrent programs is becoming a major concern. On cache-coherent shared-memory processors, synchronization efficiency is ultimately limited by the performance of the underlying cache coherence protocol. This article studies how hardware support for message passing can improve synchronization performance. Considering the ubiquitous problem of mutual exclusion, we devise novel algorithms for (i) classic locking, where application threads obtain exclusive access to a shared resource prior to executing their critical sections (CSes), and (ii) delegation, where CSes are executed by special threads. For classic locking, our HybLock algorithm uses a mix of shared memory and hardware message passing, which introduces the idea of hybrid synchronization algorithms. For delegation, we propose mp-server and HybComb: the former is a straightforward adaptation of the server approach to hardware message passing, whereas the latter is a novel hybrid combining algorithm. Evaluation on Tilera's TILE-Gx processor shows that HybLock outperforms the best known classic locks. Furthermore, mp-server can execute contended CSes with unprecedented throughput, as stalls related to cache coherence are removed from the critical path. HybComb can achieve comparable performance while avoiding the need to dedicate server cores. Consequently, our queue and stack implementations, based on the new synchronization algorithms, largely outperform their most efficient shared-memory-only counterparts.
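To make the delegation idea concrete, the following is a minimal sketch of the general server approach, in which clients send their critical sections to one dedicated thread instead of acquiring a lock. It is not the authors' mp-server implementation: Python's `queue.Queue` is only a software stand-in for hardware message queues, and the names `run_delegation_demo`, `requests`, and `reply_box` are hypothetical, chosen for illustration.

```python
import queue
import threading


def run_delegation_demo(num_clients=4, ops_per_client=1000):
    """Clients delegate 'increment' requests to one server thread.

    Illustrative only: queue.Queue stands in for a hardware message
    queue, and all names here are assumptions, not from the paper.
    """
    requests = queue.Queue()  # the server's incoming message queue
    counter = 0               # touched only by the server thread

    def server():
        nonlocal counter
        while True:
            reply_box = requests.get()
            if reply_box is None:   # shutdown sentinel
                return
            counter += 1            # critical section, run on the client's behalf
            reply_box.put(counter)  # send the result back to the requester

    def client():
        reply_box = queue.Queue(maxsize=1)  # per-client reply channel
        for _ in range(ops_per_client):
            requests.put(reply_box)  # send the request to the server
            reply_box.get()          # block until the server replies

    server_thread = threading.Thread(target=server)
    server_thread.start()

    clients = [threading.Thread(target=client) for _ in range(num_clients)]
    for t in clients:
        t.start()
    for t in clients:
        t.join()

    requests.put(None)  # all clients are done, stop the server
    server_thread.join()
    return counter
```

Calling `run_delegation_demo(4, 1000)` returns the total number of delegated increments. Because the shared counter is only ever accessed by the server thread, the critical section itself needs no lock; the cost of synchronization moves into the request and reply messages.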
Publisher
Association for Computing Machinery (ACM)
Subject
Computational Theory and Mathematics, Computer Science Applications, Hardware and Architecture, Modeling and Simulation, Software
Cited by
4 articles.
1. Mitigating Message Passing Interference in Trusted Embedded Platforms;2023 20th International SoC Design Conference (ISOCC);2023-10-25
2. DySHARQ: Dynamic Software-Defined Hardware-Managed Queues for Tile-Based Architectures;International Journal of Parallel Programming;2020-11-20
3. Fast Fine-Grained Global Synchronization on GPUs;Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems;2019-04-04
4. SHARQ: Software-Defined Hardware-Managed Queues for Tile-Based Manycore Architectures;Lecture Notes in Computer Science;2019