Affiliation:
1. Intel Corporation
2. Intel Corporation and MIT
Abstract
As microprocessor designs integrate more cores, the scalability of cache coherence protocols becomes a challenging problem. Most directory-based protocols avoid races by using blocking tag directories, which can degrade the performance of parallel applications. In this article, we first demonstrate quantitatively that state-of-the-art blocking protocols significantly constrain throughput at large core counts for several parallel applications. Nonblocking protocols address this throughput concern, but at the expense of scalability in the interconnection network or in the required resource overheads. To overcome this limitation, we enhance nonblocking directory protocols by migrating the point of service of responses. Our approach uses in-flight chains of cores making parallel memory requests to achieve scalability while maintaining high throughput. The proposed cache coherence protocol, called chained cache coherence, can outperform blocking protocols by up to 20% on scientific and 12% on commercial applications. It also has low resource overheads and simple address ordering requirements, making it both a high-performance and a scalable protocol. Furthermore, in-flight chains provide a scalable way to build hierarchical and nonblocking tag directories and to optimize communication latencies.
Publisher
Association for Computing Machinery (ACM)
Subject
Hardware and Architecture, Information Systems, Software
Cited by
3 articles.
1. DeNovoSync;ACM SIGARCH Computer Architecture News;2015-05-29
2. DeNovoSync;ACM SIGPLAN Notices;2015-05-12
3. DeNovoSync;Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems;2015-03-14