Affiliation:
1. Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh PA
2. Intel Corporation in Portland, Oregon and Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh PA
Abstract
Since the introduction of virtual memory demand-paging and cache memories, computer systems have been exploiting spatial and temporal locality to reduce the average latency of a memory reference. In this paper, we introduce the notion of value locality, a third facet of locality that is frequently present in real-world programs, and describe how to effectively capture and exploit it in order to perform load value prediction. Temporal and spatial locality are attributes of storage locations, and describe the future likelihood of references to those locations or their close neighbors. In a similar vein, value locality describes the likelihood of the recurrence of a previously-seen value within a storage location. Modern processors already exploit value locality in a very restricted sense through the use of control speculation (i.e., branch prediction), which seeks to predict the future value of a single condition bit based on previously-seen values. Our work extends this to predict entire 32- and 64-bit register values based on previously-seen values. We find that, just as condition bits are fairly predictable on a per-static-branch basis, full register values being loaded from memory are frequently predictable as well. Furthermore, we show that simple microarchitectural enhancements to two modern microprocessor implementations (based on the PowerPC 620 and Alpha 21164) that enable load value prediction can effectively exploit value locality to collapse true dependencies, reduce average memory latency and bandwidth requirements, and provide measurable performance gains.
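The notion of value locality described above can be illustrated with a small trace-based measurement. The following is a minimal sketch, not the mechanism evaluated in the paper: it assumes a hypothetical load trace given as (pc, value) pairs, where pc identifies the static load instruction, and it counts how often a load returns a value that the same static load returned recently. The function name value_locality and the parameter history_depth are illustrative assumptions.

```python
from collections import deque


def value_locality(trace, history_depth=1):
    """Fraction of dynamic loads whose value matches one of the last
    `history_depth` distinct values seen by the same static load.

    `trace` is an iterable of (pc, value) pairs; this trace format is an
    assumption made for illustration, not taken from the paper.
    """
    history = {}  # pc -> deque of recent distinct values, most recent last
    hits = 0
    total = 0
    for pc, value in trace:
        seen = history.setdefault(pc, deque(maxlen=history_depth))
        total += 1
        if value in seen:
            hits += 1
            seen.remove(value)  # re-appended below as the most recent value
        seen.append(value)  # the oldest value is evicted when the deque is full
    return hits / total if total else 0.0


# Hypothetical trace: one load that always returns 7, one that alternates.
example_trace = [
    (0x100, 7), (0x100, 7), (0x100, 7),
    (0x200, 1), (0x200, 2), (0x200, 1),
]
print(value_locality(example_trace, history_depth=1))
print(value_locality(example_trace, history_depth=2))
```

With a history depth of 1, this corresponds to asking how often a simple "predict the last value" scheme would have been right for each static load; a deeper history shows how much additional recurrence exists if more than one past value is remembered.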
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Graphics and Computer-Aided Design, Software
Cited by
97 articles.