Affiliation:
1. Intel Corporation, Hillsboro, OR
Abstract
Large instruction window processors achieve high performance by exposing large amounts of instruction level parallelism. However, accessing the large hardware structures typically required to buffer and process such instruction windows significantly degrades the cycle time. This paper proposes a novel checkpoint processing and recovery (CPR) microarchitecture, and shows how to implement a large instruction window processor without requiring large structures, thus permitting a high clock frequency. We focus on four critical aspects of a microarchitecture: (1) scheduling instructions, (2) recovering from branch mispredicts, (3) buffering a large number of stores and forwarding data from stores to any dependent load, and (4) reclaiming physical registers. While scheduling window size is important, we show the performance of large instruction windows to be more sensitive to the other three design issues. Our CPR proposal incorporates novel microarchitectural schemes for addressing these design issues: a selective checkpoint mechanism for recovering from mispredicts, a hierarchical store queue organization for fast store-load forwarding, and an effective algorithm for aggressive physical register reclamation. Our proposals allow a processor to realize performance gains due to instruction windows of thousands of instructions without requiring large cycle-critical hardware structures.
Publisher
Association for Computing Machinery (ACM)
Subject
Hardware and Architecture, Information Systems, Software
Cited by
8 articles.
1. Tuning the continual flow pipeline architecture with virtual register renaming;ACM Transactions on Architecture and Code Optimization;2014-02
2. Virtual Register Renaming;Architecture of Computing Systems – ARCS 2013;2013
3. Achieving reliable system performance by fast recovery of branch miss prediction;Journal of Network and Computer Applications;2012-05
4. A New Recovery Mechanism in Superscalar Microprocessors by Recovering Critical Misprediction;IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences;2011
5. Checkpoint allocation and release;ACM Transactions on Architecture and Code Optimization;2009-09