Abstract
Spintronic memories are considered to be promising candidates for future on-chip memories due to their high density, nonvolatility, and near-zero leakage. However, they also face challenges such as high write energy and latency and limited read speed due to single-ended sensing. Further, the conflicting requirements of read and write operations lead to stringent design constraints that severely compromises their benefits.
Recently, domain wall memory was proposed as a spintronic memory that has a potential for very high density by storing multiple bits in the domains of a ferromagnetic nanowire. While reliable operation of DWM memory with multiple domains faces many challenges, single-bit cells that utilize domain wall motion for writes have been experimentally demonstrated [Fukami et al. 2009]. This bit-cell, which we refer to as Domain Wall Memory with Shift-based Write (DWM-SW), achieves improved write efficiency and features decoupled read-write paths, enabling independent optimizations of read and write operations. However, these benefits are achieved at the cost of sacrificing the original goal of improved density. In this work, we explore multilevel storage as a new direction to enhance the density benefits of DWM-SW. At the device level, we propose a new device--multilevel DWM with shift-based write (ML-DWM-SW)--that is capable of storing 2 bits in a single device. At the circuit level, we propose a ML-DWM-SW based bit-cell design and layout. The ML-DWM-SW bit-cell incurs no additional area overhead compared to the DWM-SW bit-cell despite storing an additional bit, thereby achieving roughly twice the density. However, it requires a two-step write operation and has data-dependent read and write energies, which pose unique challenges. To address these issues, we propose suitable architectural optimizations: (i) intra-word interleaving and (ii) bit encoding. We design “all-spin” cache architectures using the proposed ML-DWM-SW bit-cell for both general purpose processors as well as general purpose graphics processing units (GPGPUs). We perform an iso-capacity replacement of SRAM with spintronic memories and study the energy and area benefits at iso-performance conditions. For general purpose processors, the ML-DWM-SW cache achieves 10X reduction in energy and 4.4X reduction in cache area compared to an SRAM cache and 2X and 1.7X reduction in energy and area, respectively, compared to an STT-MRAM cache. For GPGPUs, the ML-DWM-SW cache achieves 5.3X reduction in energy and 3.6X area reduction compared to SRAM and 3.5X energy reduction and 1.9X area reduction compared to STT-MRAM.
Publisher
Association for Computing Machinery (ACM)
Subject
Electrical and Electronic Engineering,Hardware and Architecture,Software
Cited by
4 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献
1. Fast-track cache;Proceedings of the 36th ACM International Conference on Supercomputing;2022-06-28
2. Approximate Spintronic Memories;ACM Journal on Emerging Technologies in Computing Systems;2020-10-31
3. A Survey of Techniques for Architecting Processor Components Using Domain-Wall Memory;ACM Journal on Emerging Technologies in Computing Systems;2017-04-30
4. Normally-OFF STT-MRAM Cache with Zero-Byte Compression for Energy Efficient Last-Level Caches;Proceedings of the 2016 International Symposium on Low Power Electronics and Design;2016-08-08