Affiliation:
1. Advanced Computer Architecture Laboratory, University of Michigan, Ann Arbor
Abstract
Advances in microarchitecture, packaging, and manufacturing processes enable designers to build new systems with higher performance and scalability. Using microbenchmark techniques, we contrast the memory and communication performance of two generations of the HP/Convex Exemplar scalable parallel processing system. The SPP1000 and SPP2000 have significant architectural and implementation differences, but maintain upward binary compatibility. The SPP2000 employs manufacturing and packaging advances to obtain shorter system interconnects with wider data paths and improved functionality, thereby reducing the latency and increasing the bandwidth of remote communication. Although the memory latency is not significantly improved, newer out-of-order execution processors coupled with nonblocking caches achieve much higher memory bandwidth. The SPP2000 has a richer system interconnect topology that allows scalability to a larger number of processors. The SPP2000 also employs innovations in its coherence protocols to improve synchronization and communication performance. This paper characterizes the performance effects of these changes, and identifies some remaining inefficiencies, in the cache coherence protocol and the node configuration, that future systems should address.
Publisher
Association for Computing Machinery (ACM)
Cited by
3 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献