Affiliation
1. Norwegian University of Science and Technology (NTNU), Trondheim, Norway
2. Ericsson Research, Mobilvägen, Lund, Sweden
3. Uppsala University, Uppsala, Sweden
Abstract
Exploiting memory-level parallelism (MLP) is crucial to hide long memory and last-level cache access latencies. While out-of-order (OoO) cores, and techniques building on them, are effective at exploiting MLP, they deliver poor energy efficiency due to their complex and energy-hungry hardware. This work revisits slice-out-of-order (sOoO) cores as an energy-efficient alternative for MLP exploitation. sOoO cores achieve energy efficiency by constructing slices of MLP-generating instructions and executing them out-of-order only with respect to the rest of the instructions; the slices and the remaining instructions, by themselves, execute in-order. However, we observe that existing sOoO cores miss significant MLP opportunities due to their dependence-oblivious in-order slice execution, which causes dependent slices to frequently block MLP generation. To boost MLP generation, we introduce Freeway, a sOoO core based on a new dependence-aware slice execution policy that tracks dependent slices and keeps them from blocking subsequent independent slices and MLP extraction. The proposed core incurs minimal area and power overheads, yet approaches the MLP benefits of fully OoO cores. Our evaluation shows that Freeway delivers 12% better performance than the state-of-the-art sOoO core and is within 7% of the MLP limits of full OoO execution.
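The blocking effect described in the abstract can be illustrated with a toy model. The sketch below is not the Freeway microarchitecture: the side queue, the single issue slot per cycle, the fixed 100-cycle miss latency, and the names `run`, `slices`, and `MISS_LATENCY` are all assumptions made for illustration. Each slice is reduced to one long-latency load that either is independent or waits on the data of an earlier slice.

```python
from collections import deque

MISS_LATENCY = 100  # assumed latency (cycles) of every long-latency load


def run(slices, dependence_aware):
    """Toy model: return the cycle at which the last load's data arrives.

    slices[i] is None for an independent slice, or the index of the
    earlier slice whose loaded data slice i needs before it can issue.
    At most one slice issues per cycle (an assumption of this sketch).
    """
    main = deque(range(len(slices)))  # slices in program order
    side = deque()                    # hypothetical queue for parked dependent slices
    done_at = {}                      # slice index -> cycle its data returns
    cycle = 0

    def ready(i):
        producer = slices[i]
        return producer is None or done_at.get(producer, float("inf")) <= cycle

    while main or side:
        if side and ready(side[0]):
            # A parked dependent slice has its operand: issue it first.
            done_at[side.popleft()] = cycle + MISS_LATENCY
        elif main:
            head = main[0]
            if ready(head):
                done_at[main.popleft()] = cycle + MISS_LATENCY
            elif dependence_aware:
                # Park the dependent slice so later slices can proceed.
                side.append(main.popleft())
            # Otherwise (dependence-oblivious): the head stalls the queue.
        cycle += 1
    return max(done_at.values())


# Three independent loads, each followed by a load that depends on it.
pattern = [None, 0, None, 2, None, 4]
print(run(pattern, dependence_aware=False))
print(run(pattern, dependence_aware=True))
```

With this input and the assumed parameters, the dependence-oblivious queue serializes the three independent misses behind their dependents and finishes at cycle 402, while the dependence-aware variant overlaps them and finishes at cycle 204. These are properties of the toy model only and say nothing about the performance figures reported in the abstract.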
Funder
Knut and Alice Wallenberg Foundation through the Wallenberg Academy Fellows Program
European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program
Research Council of Norway
Publisher
Association for Computing Machinery (ACM)
Subject
Hardware and Architecture, Information Systems, Software