An analysis of on-chip interconnection networks for large-scale chip multiprocessors

Published:2010-04 Issue:1 Volume:7 Page:1-28
ISSN:1544-3566
Container-title:ACM Transactions on Architecture and Code Optimization
language:en
Short-container-title:ACM Trans. Archit. Code Optim.

Author:

Sanchez Daniel¹, Michelogiannakis George¹, Kozyrakis Christos¹

Affiliation:

1. Stanford University, Stanford, CA

Abstract

With the number of cores of chip multiprocessors (CMPs) rapidly growing as technology scales down, connecting the different components of a CMP in a scalable and efficient way becomes increasingly challenging. In this article, we explore the architectural-level implications of interconnection network design for CMPs with up to 128 fine-grain multithreaded cores. We evaluate and compare different network topologies using accurate simulation of the full chip, including the memory hierarchy and interconnect, and using a diverse set of scientific and engineering workloads. We find that the interconnect has a large impact on performance, as it is responsible for 60% to 75% of the miss latency. Latency, and not bandwidth, is the primary performance constraint, since, even with many threads per core and workloads with high miss rates, networks with enough bandwidth can be efficiently implemented for the system scales we consider. From the topologies we study, the flattened butterfly consistently outperforms the mesh and fat tree on all workloads, leading to performance advantages of up to 22%. We also show that considering interconnect and memory hierarchy together when designing large-scale CMPs is crucial, and neglecting either of the two can lead to incorrect conclusions. Finally, the effect of the interconnect on overall performance becomes more important as the number of cores increases, making interconnection choices especially critical when scaling up.

Funder

Division of Computing and Communication Foundations

U.S. Department of Energy

Publisher

Association for Computing Machinery (ACM)

Subject

Hardware and Architecture, Information Systems, Software

Link

https://dl.acm.org/doi/pdf/10.1145/1736065.1736069

References 53 articles.

1. Clock rate versus IPC

2. Microarchitectural Wire Management for Performance and Power in Partitioned Architectures

3. Design tradeoffs for tiled CMP on-chip networks

Cited by 34 articles.

1. Cost-effectiveness analysis of a shifted completely connected network for massively parallel computer systems;2024-02-15

2. Scalable Architectures of Neuromorphic Computing;Synthesis Lectures on Engineering, Science, and Technology;2024

3. An Extensive Power and Performance Analysis for High Dimensional Mesh and Torus Interconnection Networks;International Journal of Distributed Systems and Technologies;2023-04-14

4. ANSA: Adaptive Near-Sensor Architecture for Dynamic DNN Processing in Compact Form Factors;IEEE Transactions on Circuits and Systems I: Regular Papers;2023-03

5. A Survey of On-Chip Hybrid Interconnect for Multicore Architectures;Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering;2023