Affiliation:
1. University of California, Los Angeles, Los Angeles, CA and University of Wisconsin - Madison
2. University of Wisconsin - Madison, Madison, WI
Abstract
General-purpose processors (GPPs), which traditionally rely on a Von Neumann-based execution model, incur burdensome power overheads, largely due to the need to dynamically extract parallelism and maintain precise state. Further, it is extremely difficult to improve their performance without increasing energy usage. Decades-old explicit-dataflow architectures eliminate many Von Neumann overheads, but have not been successful as stand-alone alternatives because of poor performance on certain workloads, due to insufficient control speculation and communication overheads.
We observe a synergy between out-of-order (OOO) and explicit-dataflow processors, whereby dynamically switching between them according to the behavior of program phases can greatly improve performance and energy efficiency. This work studies the potential of such a paradigm of heterogeneous execution models by developing a specialization engine for explicit-dataflow (SEED) and integrating it with a standard OOO core. When integrated with a dual-issue OOO core, the resulting design becomes both faster (1.33x) and dramatically more energy efficient (1.70x). Integrated with an in-order core, it becomes faster than even a dual-issue OOO core, with twice the energy efficiency.
Publisher
Association for Computing Machinery (ACM)
Cited by
6 articles.
1. Klotski: DNN Model Orchestration Framework for Dataflow Architecture Accelerators;2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD);2023-10-28
2. Clockhands: Rename-free Instruction Set Architecture for Out-of-order Processors;56th Annual IEEE/ACM International Symposium on Microarchitecture;2023-10-28
3. A Loop Optimization Method for Dataflow Architecture;2022 IEEE 24th Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, Cloud & Big Data Systems & Application (HPCC/DSS/SmartCity/DependSys);2022-12
4. Accelerator-level parallelism;Communications of the ACM;2021-12
5. DRT: A Lightweight Runtime for Developing Benchmarks for a Dataflow Execution Model;Architecture of Computing Systems;2021