Affiliation:
1. University of Rochester, Rochester, NY
Abstract
With technology scaling, on-chip power dissipation and off-chip memory bandwidth have become significant performance bottlenecks in virtually all computer systems, from mobile devices to supercomputers. An effective way of improving performance in the face of bandwidth and power limitations is to rely on associative memory systems. Recent work on a PCM-based, associative TCAM accelerator shows that associative search capability can reduce both off-chip bandwidth demand and overall system energy. Unfortunately, previously proposed resistive TCAM accelerators have limited flexibility: only a restricted (albeit important) class of applications can benefit from a TCAM accelerator, and the implementation is confined to resistive memory technologies with a high dynamic range (R_High/R_Low), such as PCM.
This work proposes AC-DIMM, a flexible, high-performance associative compute engine built on a DDR3-compatible memory module. AC-DIMM addresses the limited flexibility of previous resistive TCAM accelerators by combining two powerful capabilities: associative search and processing in memory. Generality is improved by augmenting a TCAM system with a set of integrated, user-programmable microcontrollers that operate directly on search results, and by architecting the system such that key-value pairs can be co-located in the same TCAM row. A new, bit-serial TCAM array is proposed, which enables the system to be implemented using STT-MRAM. AC-DIMM achieves a 4.2X speedup and a 6.5X energy reduction over a conventional RAM-based system on a set of 13 evaluated applications.
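The combination of associative search with in-memory processing over co-located key-value pairs can be illustrated with a small software model. The sketch below is not the AC-DIMM design or its programming interface. It is a minimal illustration that assumes 8-bit key and value fields, a masked search in which only the search pattern carries don't-care bits, and a plain Python callable standing in for a microcontroller program. All names (`pack`, `search`, `process`, `KEY_BITS`, `VALUE_BITS`) are hypothetical, and the sequential loop stands in for what hardware would do across all rows in parallel.

```python
from typing import Callable, List

KEY_BITS = 8    # assumed key field width (illustrative only)
VALUE_BITS = 8  # assumed value field width (illustrative only)


def pack(key: int, value: int) -> int:
    """Place a key and its value side by side in one row word."""
    return (key << VALUE_BITS) | value


def search(rows: List[int], pattern: int, care_mask: int) -> List[int]:
    """Return indices of rows equal to `pattern` on every bit set in `care_mask`.

    Bits cleared in `care_mask` act as don't-care positions.
    """
    return [i for i, row in enumerate(rows) if (row ^ pattern) & care_mask == 0]


def process(rows: List[int], matches: List[int],
            op: Callable[[List[int]], int]) -> int:
    """Apply `op` to the value fields of the matching rows.

    `op` is a stand-in for a user program that runs on search results.
    """
    value_mask = (1 << VALUE_BITS) - 1
    return op([rows[i] & value_mask for i in matches])


# Three rows, two of which share the key 0x2A.
rows = [pack(0x2A, 5), pack(0x10, 7), pack(0x2A, 9)]

# Compare only the key field; the value field is masked out as don't-care.
key_only = ((1 << KEY_BITS) - 1) << VALUE_BITS
hits = search(rows, pack(0x2A, 0), key_only)

# Reduce the values of the matching rows without copying the rows elsewhere.
total = process(rows, hits, sum)
```

In this model the search returns the rows whose key field equals 0x2A, and the reduction then reads the values stored in those same rows, which is the benefit the abstract attributes to keeping keys and values in one row.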
Funder
International Business Machines Corporation
Division of Computing and Communication Foundations
New York State Office of Science and Technology
Qualcomm
Cisco Systems
Samsung
Publisher
Association for Computing Machinery (ACM)
Cited by
41 articles.