Affiliation:
1. Princeton University, NJ, USA
2. University of Murcia, Murcia, Spain
Abstract
In today’s computers, heterogeneous processing is used to meet performance targets at manageable power. In adopting increased compute specialization, however, the relative amount of time spent on communication increases. System and software optimizations for communication often come at the cost of increased complexity and reduced portability. The Decoupled Supply-Compute (DeSC) approach offers a way to attack communication latency bottlenecks automatically, while maintaining good portability and low complexity. Our work expands prior Decoupled Access Execute techniques with hardware/software specialization. For a range of workloads, DeSC offers roughly 2× speedup, and additional specialized compression optimizations reduce traffic between decoupled units by 40%.
Publisher
Association for Computing Machinery (ACM)
Subject
Hardware and Architecture, Information Systems, Software
Cited by
9 articles.
1. HardCilk: Cilk-like Task Parallelism for FPGAs; 2024 IEEE 32nd Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM); 2024-05-05
2. Survival of the Fastest: Enabling More Out-of-Order Execution in Dataflow Circuits; Proceedings of the 2024 ACM/SIGDA International Symposium on Field Programmable Gate Arrays; 2024-04
3. A High-Frequency Load-Store Queue with Speculative Allocations for High-Level Synthesis; 2023 International Conference on Field Programmable Technology (ICFPT); 2023-12-12
4. An architecture interface and offload model for low-overhead, near-data, distributed accelerators; 2022 55th IEEE/ACM International Symposium on Microarchitecture (MICRO); 2022-10
5. GraphAttack; ACM Transactions on Architecture and Code Optimization; 2021-12-31