Affiliation:
1. University of Cambridge, Cambridge, United Kingdom
Abstract
Many modern workloads compute on large amounts of data, often with irregular memory accesses. Current architectures perform poorly for these workloads, as existing prefetching techniques cannot capture the memory access patterns; these applications end up heavily memory-bound as a result. Although a number of techniques exist to explicitly configure a prefetcher with traversal patterns, gaining significant speedups, they do not generalise beyond their target data structures. Instead, we propose an event-triggered programmable prefetcher combining the flexibility of a general-purpose computational unit with an event-based programming model, along with compiler techniques to automatically generate events from the original source code with annotations. This allows more complex fetching decisions to be made, without needing to stall when intermediate results are required. Using our programmable prefetching system, combined with small prefetch kernels extracted from applications, we achieve an average 3.0x speedup in simulation for a variety of graph, database and HPC workloads.
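The event-based idea in the abstract can be illustrated with a toy model of an indirect access loop, `data[index[i]]`. This is only a sketch under strong assumptions, not the paper's hardware or programming interface: the function names, the two event kernels, and the `lookahead` parameter are hypothetical, the cache is unbounded, and memory latency is not modelled. It shows one event (the core loading `index[i]`) triggering a prefetch of a later index entry, and a second event (that entry arriving) triggering the dependent data prefetch, so the core never has to wait on the intermediate value.

```python
from collections import deque

def run(index, data_len, lookahead=4, use_prefetcher=True):
    """Count demand misses for the loop: for i: use(data[index[i]])."""
    assert all(0 <= v < data_len for v in index)
    cache = set()      # cached addresses, e.g. ("index", 3) or ("data", 7)
    events = deque()   # pending prefetcher events
    misses = 0

    def on_core_load(i):
        # Hypothetical event kernel 1: the core loaded index[i],
        # so fetch index[i + lookahead] ahead of time.
        j = i + lookahead
        if j < len(index):
            cache.add(("index", j))
            events.append(("index_arrived", j))

    def on_index_arrived(j):
        # Hypothetical event kernel 2: prefetched index[j] has arrived;
        # use its value to fetch the dependent data element.
        cache.add(("data", index[j]))

    for i in range(len(index)):
        # Demand accesses made by the core in iteration i.
        for addr in (("index", i), ("data", index[i])):
            if addr not in cache:
                misses += 1
                cache.add(addr)
        if use_prefetcher:
            on_core_load(i)
            while events:
                kind, j = events.popleft()
                if kind == "index_arrived":
                    on_index_arrived(j)
    return misses

if __name__ == "__main__":
    idx = [7, 2, 9, 4, 1, 8, 3, 6]
    print(run(idx, 10, lookahead=2, use_prefetcher=False))
    print(run(idx, 10, lookahead=2, use_prefetcher=True))
```

In this idealised model only the first `lookahead` iterations miss when the prefetcher is enabled; a real system would also have to account for latency, cache capacity, and prefetch timeliness.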
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Graphics and Computer-Aided Design, Software
Reference: 58 articles.
1. Graph Prefetching Using Data Structure Knowledge
2. AnandTech. http://www.anandtech.com/show/8718/the-samsung-galaxy-note-4-exynos-review/6 a.
3. AnandTech. http://www.anandtech.com/show/8542/cortexm7-launches-embedded-iot-and-wearables/2 b.
Cited by
9 articles.
1. Tyche: An Efficient and General Prefetcher for Indirect Memory Accesses;ACM Transactions on Architecture and Code Optimization;2024-03-23
2. A Tensor Marshaling Unit for Sparse Tensor Algebra on General-Purpose Processors;56th Annual IEEE/ACM International Symposium on Microarchitecture;2023-10-28
3. A Survey on the Proposed Architectures for Efficient Execution of Irregular Applications Using Pipeline Parallelism;2023 Congress in Computer Science, Computer Engineering, & Applied Computing (CSCE);2023-07-24
4. Crescent;Proceedings of the 49th Annual International Symposium on Computer Architecture;2022-06-11
5. Tiny but mighty;Proceedings of the 49th Annual International Symposium on Computer Architecture;2022-06-11