Affiliation:
1. University of Cantabria, Spain
Abstract
Although abstraction is the best approach to dealing with computing system complexity, implementation details must sometimes be considered. For on-chip interconnection networks in particular, underestimating the specificity of the underlying system can have a nonnegligible impact on performance, cost, or correctness. This article presents a very efficient router devised to deal with the particularities of cache-coherent chip multiprocessors in a balanced way. Employing the same principles of packet rotation structures as in the rotary router, we present a router configuration with the following novel features: (1) reduced buffering requirements, (2) an optimized pipeline under contentionless conditions, (3) a more efficient deadlock avoidance mechanism, and (4) an optimized in-order delivery guarantee. Putting it all together, our proposal provides a set of features that no other router, to the best of our knowledge, has achieved previously: (1') low implementation cost, (2') low pass-through latency under low load, (3') improved resource utilization through adaptive routing and a buffering scheme free of head-of-line blocking, (4') guaranteed coherence protocol correctness via end-to-end deadlock avoidance and in-order delivery, and (5') improved coherence protocol responsiveness through adaptive in-network multicast support. We conduct a thorough evaluation that includes hardware cost estimation and performance evaluation under a wide spectrum of realistic workloads and coherence protocols. Compared with VCTM, an optimized state-of-the-art wormhole router, our proposal requires 50% less area, reduces the energy-delay product of the on-chip cache hierarchy by 20% on average, and improves cache-coherent chip multiprocessor performance under realistic working conditions by up to 20%.
Publisher
Association for Computing Machinery (ACM)
Subject
Hardware and Architecture, Information Systems, Software
Cited by
4 articles.