Affiliation:
1. University of Illinois at Urbana-Champaign, Urbana-Champaign, IL
2. University of California, Santa Cruz, Santa Cruz, CA
Abstract
Modern superscalar processors often suffer long stalls because of load misses in on-chip L2 caches. To address this problem, we propose hiding L2 misses with Checkpoint-Assisted VAlue prediction (CAVA). On an L2 cache miss, a predicted value is returned to the processor. When the missing load finally reaches the head of the ROB, the processor checkpoints its state, retires the load, and continues execution speculatively using the predicted value. When the value in memory arrives at the L2 cache, it is compared to the predicted value. If the prediction was correct, speculation has succeeded and execution continues; otherwise, execution is rolled back and restarted from the checkpoint. CAVA uses fast checkpointing, speculative buffering, and a modest-sized value prediction structure that has about 50% accuracy. Compared to an aggressive superscalar processor, CAVA speeds up execution by up to 1.45 for SPECint applications and 1.58 for SPECfp applications, with a geometric mean of 1.14 for SPECint and 1.34 for SPECfp applications. We also evaluate an implementation of Runahead execution, a previously proposed scheme that does not perform value prediction and discards all work done between checkpoint and data reception from memory. Runahead execution speeds up execution by a geometric mean of 1.07 for SPECint and 1.18 for SPECfp applications, compared to the same baseline.
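The predict, checkpoint, validate, and roll back sequence described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware design from the paper: the names `LastValuePredictor` and `ToyCore` are hypothetical, the last-value prediction policy is an assumption (the abstract does not say which predictor CAVA uses), and the model takes the checkpoint at the moment of the miss rather than when the load reaches the head of the ROB.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class LastValuePredictor:
    """Hypothetical predictor: guesses the last value seen for a load PC.

    The last-value policy is an assumption made for illustration only.
    """
    table: Dict[int, int] = field(default_factory=dict)

    def predict(self, pc: int) -> int:
        # Predict 0 for a load that has never been seen before.
        return self.table.get(pc, 0)

    def train(self, pc: int, actual: int) -> None:
        self.table[pc] = actual


@dataclass
class ToyCore:
    """Toy register-file model of checkpointed value speculation."""
    regs: Dict[str, int] = field(default_factory=dict)
    predictor: LastValuePredictor = field(default_factory=LastValuePredictor)
    checkpoint: Optional[Dict[str, int]] = None
    rollbacks: int = 0

    def load_miss(self, pc: int, dest: str) -> int:
        """On an L2 miss: checkpoint state, then continue with a prediction."""
        self.checkpoint = dict(self.regs)
        predicted = self.predictor.predict(pc)
        self.regs[dest] = predicted
        return predicted

    def memory_reply(self, pc: int, dest: str, predicted: int, actual: int) -> bool:
        """When memory answers: keep the speculative work or roll back.

        Returns True if the prediction was correct.
        """
        assert self.checkpoint is not None, "no speculation in flight"
        self.predictor.train(pc, actual)
        correct = predicted == actual
        if not correct:
            # Discard all speculative register writes, then redo the load
            # with the real value; dependent work must be re-executed.
            self.regs = dict(self.checkpoint)
            self.regs[dest] = actual
            self.rollbacks += 1
        self.checkpoint = None
        return correct
```

In this model, a first miss on a given load is mispredicted (the table is empty), so any register written after the load is discarded on rollback. A second miss on the same load that returns the same value is predicted correctly, and the work done in the shadow of the miss is kept.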
Publisher
Association for Computing Machinery (ACM)
Subject
Hardware and Architecture, Information Systems, Software
Cited by
17 articles.
1. A Machine Learning Based Load Value Approximator Guided by the Tightened Value Locality;Proceedings of the Great Lakes Symposium on VLSI 2023;2023-06-05
2. Efficient invisible speculative execution through selective delay and value prediction;Proceedings of the 46th International Symposium on Computer Architecture;2019-06-22
3. AVPP;ACM Transactions on Architecture and Code Optimization;2019-01-08
4. Towards Breaking the Memory Bandwidth Wall Using Approximate Value Prediction;Approximate Circuits;2018-12-06
5. Approximate Cache Architectures;Approximate Circuits;2018-12-06