Affiliation:
1. Laboratory for Computer Science, Massachusetts Institute of Technology, Cambridge, MA
Abstract
Shared-memory multiprocessors commonly use shared variables for synchronization. Our simulations of real parallel applications show that large-scale cache-coherent multiprocessors suffer significant amounts of invalidation traffic due to synchronization. Large multiprocessors that do not cache synchronization variables are often more severely impacted. If this synchronization traffic is not reduced or managed adequately, synchronization references can cause severe congestion in the network. We propose a class of adaptive backoff methods that do not use any extra hardware and can significantly reduce the memory traffic to synchronization variables. These methods use synchronization state to reduce polling of synchronization variables. Our simulations show that when the number of processors participating in a barrier synchronization is small compared to the time of arrival of the processors, reductions of 20 percent to over 95 percent in synchronization traffic can be achieved at no extra cost. In other situations, adaptive backoff techniques result in a tradeoff between reduced network accesses and increased processor idle time.
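The idea of using synchronization state to reduce polling can be illustrated with a small sketch. It is not taken from the paper: the function name `count_polls`, the fixed `poll_interval`, and the policy of delaying the first poll by one interval per processor still missing from the barrier are all assumptions made for illustration. The sketch counts how often waiting processors poll a barrier flag and how long they sit idle after the barrier has already been released.

```python
def count_polls(arrivals, poll_interval, adaptive):
    """Return (total_polls, total_idle) for one barrier episode.

    arrivals      -- arrival time of each processor at the barrier
    poll_interval -- time between successive polls of the barrier flag
    adaptive      -- if True, delay the first poll using barrier state
                     (hypothetical policy, assumed for this sketch)
    """
    n = len(arrivals)
    release = max(arrivals)  # barrier opens when the last processor arrives
    total_polls = 0
    total_idle = 0
    for rank, t in enumerate(sorted(arrivals), start=1):
        # Synchronization state seen on arrival: processors still missing.
        missing = n - rank
        # Assumed policy: wait one poll interval per missing processor.
        t_poll = t + (missing * poll_interval if adaptive else 0)
        total_polls += 1  # first poll
        while t_poll < release:
            t_poll += poll_interval
            total_polls += 1
        # Time spent waiting after the barrier was already released.
        total_idle += t_poll - release
    return total_polls, total_idle
```

With arrivals spread out, for example `[0, 10, 20, 30]` and a poll interval of 1, the assumed policy removes some polls without adding idle time. With simultaneous arrivals, for example `[0, 0, 0, 0]` and a poll interval of 5, it saves no polls and adds idle time, which mirrors the tradeoff the abstract mentions.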
Publisher
Association for Computing Machinery (ACM)
Cited by
9 articles.
1. CLoF;Proceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles CD-ROM;2021-10-26
2. Power-aware pipelining with automatic concurrency control;Concurrency and Computation: Practice and Experience;2018-08-14
3. Lock Cohorting;ACM Transactions on Parallel Computing;2015-02-18
4. Effective Barrier Synchronization on Intel Xeon Phi Coprocessor;Lecture Notes in Computer Science;2015
5. Lock cohorting;ACM SIGPLAN Notices;2012-09-11