Understanding Latency Variation in Modern DRAM Chips

Published:2016-06-30 Issue:1 Volume:44 Page:323-336
ISSN:0163-5999
Container-title:ACM SIGMETRICS Performance Evaluation Review
language:en
Short-container-title:SIGMETRICS Perform. Eval. Rev.

Author:

Chang Kevin K.¹, Kashyap Abhijith¹, Hassan Hasan², Ghose Saugata¹, Hsieh Kevin³, Lee Donghyuk¹, Li Tianshi⁴, Pekhimenko Gennady¹, Khan Samira⁵, Mutlu Onur⁶

Affiliation:

1. Carnegie Mellon University, Pittsburgh, PA, USA

2. Carnegie Mellon University & TOBB ETU, Ankara, Turkey

3. Carnegie Mellon University, Pittsburgh, USA

4. Peking University & Carnegie Mellon University, Pittsburgh, PA, USA

5. University of Virginia, Charlottesville, VA, USA

6. ETH Zurich & Carnegie Mellon University, Pittsburgh, PA, USA

Abstract

Long DRAM latency is a critical performance bottleneck in current systems. DRAM access latency is defined by three fundamental operations that take place within the DRAM cell array: (i) activation of a memory row, which opens the row to perform accesses; (ii) precharge, which prepares the cell array for the next memory access; and (iii) restoration of the row, which restores the values of cells in the row that were destroyed due to activation. There is significant latency variation for each of these operations across the cells of a single DRAM chip due to irregularity in the manufacturing process. As a result, some cells are inherently faster to access, while others are inherently slower. Unfortunately, existing systems do not exploit this variation. The goal of this work is to (i) experimentally characterize and understand the latency variation across cells within a DRAM chip for these three fundamental DRAM operations, and (ii) develop new mechanisms that exploit our understanding of the latency variation to reliably improve performance. To this end, we comprehensively characterize 240 DRAM chips from three major vendors, and make several new observations about latency variation within DRAM. We find that (i) there is large latency variation across the cells for each of the three operations; (ii) variation characteristics exhibit significant spatial locality: slower cells are clustered in certain regions of a DRAM chip; and (iii) the three fundamental operations exhibit different reliability characteristics when the latency of each operation is reduced. Based on our observations, we propose Flexible-LatencY DRAM (FLY-DRAM), a mechanism that exploits latency variation across DRAM cells within a DRAM chip to improve system performance. The key idea of FLY-DRAM is to exploit the spatial locality of slower cells within DRAM, and access the faster DRAM regions with reduced latencies for the fundamental operations. 
Our evaluations show that FLY-DRAM improves the performance of a wide range of applications by 13.3%, 17.6%, and 19.5%, on average, for each of the three different vendors' real DRAM chips, in a simulated 8-core system. We conclude that the experimental characterization and analysis of latency variation within modern DRAM, provided by this work, can lead to new techniques that improve DRAM and system performance.
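
The key idea stated in the abstract (use reduced latencies for faster DRAM regions, standard latencies for slower ones) can be illustrated with a minimal sketch. This is not the authors' implementation: the row-based region granularity, the class and function names, and the timing values are all assumptions made for illustration, and the numbers are placeholders rather than measurements from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Timing:
    """Timing parameters for a DRAM access, in nanoseconds."""
    activation_ns: float  # latency of activating (opening) a row
    precharge_ns: float   # latency of precharging for the next access


# Placeholder values for illustration only; not taken from the paper.
STANDARD = Timing(activation_ns=13.0, precharge_ns=13.0)
REDUCED = Timing(activation_ns=10.0, precharge_ns=10.0)


class RegionLatencyTable:
    """Hypothetical lookup table: maps a row to the timing its region tolerates.

    Assumes a prior profiling step has identified which regions contain
    slow cells; that step is outside the scope of this sketch.
    """

    def __init__(self, rows_per_region, slow_regions):
        if rows_per_region <= 0:
            raise ValueError("rows_per_region must be positive")
        self.rows_per_region = rows_per_region
        self.slow_regions = frozenset(slow_regions)

    def timing_for_row(self, row):
        # Regions with slow cells keep the standard (safe) timing;
        # all other regions are accessed with the reduced timing.
        region = row // self.rows_per_region
        return STANDARD if region in self.slow_regions else REDUCED


# Example: regions of 512 rows each, with region 1 profiled as slow.
table = RegionLatencyTable(rows_per_region=512, slow_regions={1})
```

A memory controller model would consult such a table on each access and fall back to the standard timing whenever a region has not been profiled as fast, which is how the sketch keeps accesses to slow cells reliable.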

Funder

National Science Foundation

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Networks and Communications, Hardware and Architecture, Software

Link

https://dl.acm.org/doi/pdf/10.1145/2964791.2901453

Cited by 18 articles.

1. Full-Stack Revision of Memory and Data Management in PDES on Multi-Core Machines;Proceedings of the 33rd International Symposium on High-Performance Parallel and Distributed Computing;2024-06-03

2. Spatial Variation-Aware Read Disturbance Defenses: Experimental Analysis of Real DRAM Chips and Implications on Future Solutions;2024 IEEE International Symposium on High-Performance Computer Architecture (HPCA);2024-03-02

3. Functionally-Complete Boolean Logic in Real DRAM Chips: Experimental Characterization and Analysis;2024 IEEE International Symposium on High-Performance Computer Architecture (HPCA);2024-03-02

4. iNUMAlloc: Towards Intelligent Memory Allocation for AI Accelerators with NUMA;2023 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom);2023-12-21

5. Mitigation of Rowhammer Attack on DDR4 Memory: A Novel Multi-Table Frequent Element Algorithm Based Approach;2023 IEEE 66th International Midwest Symposium on Circuits and Systems (MWSCAS);2023-08-06