Abstract
Efficient locking mechanisms are critically important for high performance computers. On highly-threaded systems with a deep memory hierarchy, the throughput of traditional queueing locks, e.g., MCS locks, falls off due to NUMA effects. Two-level cohort locks perform better on NUMA systems, but fail to deliver top performance for deep NUMA hierarchies. In this paper, we describe a hierarchical variant of the MCS lock that adapts the principles of cohort locking for architectures with deep NUMA hierarchies. We describe analytical models for throughput and fairness of Cohort-MCS (C-MCS) and Hierarchical MCS (HMCS) locks that enable us to tailor these locks for high performance on any target platform without empirical tuning. Using these models, one can select parameters such that an HMCS lock will deliver better fairness than a C-MCS lock for a given throughput, or deliver better throughput for a given fairness. Our experiments show that, under high contention, a three-level HMCS lock delivers up to 7.6x higher lock throughput than a C-MCS lock on a 128-thread IBM Power 755 and a five-level HMCS lock delivers up to 72x higher lock throughput on a 4096-thread SGI UV 1000. On the K-means clustering code from the MineBench suit, a three-level HMCS lock reduces the running time by up to 55% compared to the C-MCS lock on a IBM Power 755.
Funder
U.S. Department of Energy
Lawrence Berkely National Laboratory
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Graphics and Computer-Aided Design,Software
Cited by
12 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献
1. CAL: Core-Aware Lock for the big.LITTLE Multicore Architecture;Applied Sciences;2024-07-24
2. Lightweight Latches for B-Trees to Cope with High Contention;Lecture Notes in Computer Science;2024
3. Protecting Locks Against Unbalanced Unlock();Proceedings of the 35th ACM Symposium on Parallelism in Algorithms and Architectures;2023-06-17
4. A NUMA-Aware Recoverable Mutex Lock;Proceedings of the 34th ACM Symposium on Parallelism in Algorithms and Architectures;2022-07-11
5. Asymmetry-aware scalable locking;Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming;2022-03-28