High performance locks for multi-level NUMA systems

Published: 2015-12-18 Issue: 8 Volume: 50 Pages: 215-226
ISSN: 0362-1340
Container-title: ACM SIGPLAN Notices
Language: en
Short-container-title: SIGPLAN Not.

Author:

Chabbi Milind¹, Fagan Michael¹, Mellor-Crummey John¹

Affiliation:

1. Rice University, USA

Abstract

Efficient locking mechanisms are critically important for high performance computers. On highly-threaded systems with a deep memory hierarchy, the throughput of traditional queueing locks, e.g., MCS locks, falls off due to NUMA effects. Two-level cohort locks perform better on NUMA systems, but fail to deliver top performance for deep NUMA hierarchies. In this paper, we describe a hierarchical variant of the MCS lock that adapts the principles of cohort locking for architectures with deep NUMA hierarchies. We describe analytical models for throughput and fairness of Cohort-MCS (C-MCS) and Hierarchical MCS (HMCS) locks that enable us to tailor these locks for high performance on any target platform without empirical tuning. Using these models, one can select parameters such that an HMCS lock will deliver better fairness than a C-MCS lock for a given throughput, or deliver better throughput for a given fairness. Our experiments show that, under high contention, a three-level HMCS lock delivers up to 7.6x higher lock throughput than a C-MCS lock on a 128-thread IBM Power 755 and a five-level HMCS lock delivers up to 72x higher lock throughput on a 4096-thread SGI UV 1000. On the K-means clustering code from the MineBench suite, a three-level HMCS lock reduces the running time by up to 55% compared to the C-MCS lock on an IBM Power 755.
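
For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of a classic single-level MCS queue lock using C11 atomics. It is not the paper's HMCS or C-MCS implementation and it does no NUMA-aware handoff; it only illustrates the queueing mechanism those locks build on. The names `mcs_node_t`, `mcs_lock_t`, `mcs_lock_init`, `mcs_acquire`, and `mcs_release` are hypothetical and chosen for this illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Per-thread queue node (hypothetical name). Each waiter spins on its own
 * `locked` flag, so waiting does not hammer a single shared cache line. */
typedef struct mcs_node {
    _Atomic(struct mcs_node *) next;
    atomic_bool locked;
} mcs_node_t;

/* The lock is a single pointer to the tail of the waiter queue. */
typedef struct {
    _Atomic(mcs_node_t *) tail;
} mcs_lock_t;

static void mcs_lock_init(mcs_lock_t *lock) {
    atomic_init(&lock->tail, NULL);
}

static void mcs_acquire(mcs_lock_t *lock, mcs_node_t *me) {
    atomic_store_explicit(&me->next, NULL, memory_order_relaxed);
    atomic_store_explicit(&me->locked, true, memory_order_relaxed);

    /* Swap ourselves in as the new tail; the old tail is our predecessor. */
    mcs_node_t *pred =
        atomic_exchange_explicit(&lock->tail, me, memory_order_acq_rel);
    if (pred != NULL) {
        /* Queue was non-empty: link behind the predecessor and spin locally. */
        atomic_store_explicit(&pred->next, me, memory_order_release);
        while (atomic_load_explicit(&me->locked, memory_order_acquire)) {
            /* busy-wait until the predecessor hands over the lock */
        }
    }
}

static void mcs_release(mcs_lock_t *lock, mcs_node_t *me) {
    mcs_node_t *succ = atomic_load_explicit(&me->next, memory_order_acquire);
    if (succ == NULL) {
        /* No visible successor: try to reset the queue to empty. */
        mcs_node_t *expected = me;
        if (atomic_compare_exchange_strong_explicit(
                &lock->tail, &expected, NULL,
                memory_order_acq_rel, memory_order_acquire)) {
            return;
        }
        /* A successor swapped into the tail but has not linked yet; wait. */
        while ((succ = atomic_load_explicit(&me->next,
                                            memory_order_acquire)) == NULL) {
        }
    }
    /* Hand the lock directly to the successor. */
    atomic_store_explicit(&succ->locked, false, memory_order_release);
}
```

Each thread passes its own `mcs_node_t` to `mcs_acquire` and the same node to `mcs_release`. Because handoff follows strict arrival order, consecutive lock holders may sit in different NUMA domains, which is the throughput loss the abstract attributes to traditional queueing locks.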

Funder

U.S. Department of Energy

Lawrence Berkeley National Laboratory

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Graphics and Computer-Aided Design, Software

Link

https://dl.acm.org/doi/pdf/10.1145/2858788.2688503

References: 9 articles.

1. Flat-combining NUMA locks

2. Lock cohorting

3. A Hierarchical CLH Queue Lock

Cited by 12 articles.

1. CAL: Core-Aware Lock for the big.LITTLE Multicore Architecture;Applied Sciences;2024-07-24

2. Lightweight Latches for B-Trees to Cope with High Contention;Lecture Notes in Computer Science;2024

3. Protecting Locks Against Unbalanced Unlock();Proceedings of the 35th ACM Symposium on Parallelism in Algorithms and Architectures;2023-06-17

4. A NUMA-Aware Recoverable Mutex Lock;Proceedings of the 34th ACM Symposium on Parallelism in Algorithms and Architectures;2022-07-11

5. Asymmetry-aware scalable locking;Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming;2022-03-28