Informing memory operations

Published:1996-05 Issue:2 Volume:24 Page:260-270
ISSN:0163-5964
Container-title:ACM SIGARCH Computer Architecture News
language:en
Short-container-title:SIGARCH Comput. Archit. News

Author:

Horowitz Mark¹, Martonosi Margaret², Mowry Todd C.³, Smith Michael D.⁴

Affiliation:

1. Computer Systems Laboratory, Stanford University

2. Department of Electrical Engineering, Princeton University

3. Department of Electrical and Computer Engineering, University of Toronto

4. Division of Applied Sciences, Harvard University

Abstract

Memory latency is an important bottleneck in system performance that cannot be adequately solved by hardware alone. Several promising software techniques have been shown to address this problem successfully in specific situations. However, the generality of these software approaches has been limited because current architectures do not provide a fine-grained, low-overhead mechanism for observing and reacting to memory behavior directly. To fill this need, we propose a new class of memory operations called informing memory operations, which essentially consist of a memory operation combined (either implicitly or explicitly) with a conditional branch-and-link operation that is taken only if the reference suffers a cache miss. We describe two different implementations of informing memory operations (one based on a cache-outcome condition code and another based on low-overhead traps) and find that modern in-order-issue and out-of-order-issue superscalar processors already contain the bulk of the necessary hardware support. We describe how a number of software-based memory optimizations can exploit informing memory operations to enhance performance, and look at cache coherence with fine-grained access control as a case study. Our performance results demonstrate that the runtime overhead of invoking the informing mechanism on the Alpha 21164 and MIPS R10000 processors is generally small enough to provide considerable flexibility to hardware and software designers, and that the cache coherence application has improved performance compared to other current solutions. We believe that the inclusion of informing memory operations in future processors may spur even more innovative performance optimizations.
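
As a rough illustration of the idea in the abstract, the sketch below models an informing load in software: a load that returns its value as usual and transfers control to a handler only when the reference misses in the cache. Everything here is an assumption made for illustration and does not come from the paper: the names `DirectMappedCache`, `informing_load`, and `count_misses`, the toy direct-mapped cache with 4 lines of 16 bytes, and the dictionary standing in for memory. A real implementation would be a hardware mechanism (a condition code or a low-overhead trap), not a function call.

```python
from typing import Callable, Dict, List, Optional


class DirectMappedCache:
    """Toy direct-mapped cache that tracks tags only (hypothetical model)."""

    def __init__(self, num_lines: int = 4, line_size: int = 16) -> None:
        self.num_lines = num_lines
        self.line_size = line_size
        self.tags: List[Optional[int]] = [None] * num_lines

    def access(self, addr: int) -> bool:
        """Return True on a hit; on a miss, fill the line and return False."""
        block = addr // self.line_size
        index = block % self.num_lines
        tag = block // self.num_lines
        if self.tags[index] == tag:
            return True
        self.tags[index] = tag
        return False


def informing_load(
    cache: DirectMappedCache,
    memory: Dict[int, int],
    addr: int,
    miss_handler: Callable[[int], None],
) -> int:
    """Load memory[addr]; call miss_handler(addr) only if the access misses.

    The handler call stands in for the conditional branch-and-link that is
    taken on a cache miss. On a hit, no handler code runs at all.
    """
    if not cache.access(addr):
        miss_handler(addr)
    return memory.get(addr, 0)  # unmapped addresses read as 0 in this toy model


def count_misses(addresses: List[int]) -> List[int]:
    """Run a sequence of informing loads and record which addresses missed."""
    cache = DirectMappedCache()
    memory: Dict[int, int] = {}
    missed: List[int] = []
    for addr in addresses:
        informing_load(cache, memory, addr, missed.append)
    return missed
```

In this model the handler only records the missing address, which is the kind of fine-grained observation a profiling tool might want; a handler could just as well issue prefetches or enforce access control, as the abstract suggests.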

Publisher

Association for Computing Machinery (ACM)

Link

https://dl.acm.org/doi/pdf/10.1145/232974.233000

References: 35 articles.

1. The MIT Alewife machine

2. The Tera computer system

3. A. Agarwal, J. Kubiatowicz, D. Kranz, et al. Sparcle: An Evolutionary Processor Design for Large-Scale Multiprocessors. IEEE Micro, pp. 48-61, June 1993. 10.1109/40.216748

4. Automatic program transformations for virtual memory computers

Cited by 7 articles.

1. VMT: Virtualized Multi-Threading for Accelerating Graph Workloads on Commodity Processors;IEEE Transactions on Computers;2021

2. Bridging the Latency Gap between NVM and DRAM for Latency-bound Operations;Proceedings of the 15th International Workshop on Data Management on New Hardware - DaMoN'19;2019

3. SWOOP: software-hardware co-design for non-speculative, execute-ahead, in-order cores;Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation;2018-06-11

4. Transcending Hardware Limits with Software Out-of-Order Processing;IEEE Computer Architecture Letters;2017-07-01

5. Random Fill Cache Architecture;2014 47th Annual IEEE/ACM International Symposium on Microarchitecture;2014-12