Affiliation:
1. Cornell University, Ithaca, NY
2. University of Central Florida
Abstract
We introduce the SMTp architecture, an SMT processor augmented with a coherence protocol thread context, that together with a standard integrated memory controller can enable the design of (among other possibilities) scalable cache-coherent hardware distributed shared memory (DSM) machines from commodity nodes. We describe the minor changes needed to a conventional out-of-order multi-threaded core to realize SMTp, discussing issues related to both deadlock avoidance and performance. We then compare SMTp performance to that of various conventional DSM machines with normal SMT processors both with and without integrated memory controllers. On configurations from 1 to 32 nodes, with 1 to 4 application threads per node, we find that SMTp delivers performance comparable to, and sometimes better than, machines with more complex integrated DSM-specific memory controllers. Our results also show that the protocol thread has extremely low pipeline overhead. Given the simplicity and the flexibility of the SMTp mechanism, we argue that next-generation multi-threaded processors with integrated memory controllers should adopt this mechanism as a way of building less complex high-performance DSM multiprocessors.
Publisher
Association for Computing Machinery (ACM)
Cited by
1 article.
1. Middleware Memory Management in NoC; Designing 2D and 3D Network-on-Chip Architectures; 2013-10-09