Affiliation:
1. Computer Laboratory, University of Cambridge, Cambridge, UK
Abstract
Although custom (and reconfigurable) computing can provide orders-of-magnitude improvements in energy efficiency and performance for many numeric, data-parallel applications, performance on nonnumeric, sequential code is often worse than on conventional superscalar processors. This work attempts to improve sequential performance in custom hardware by (a) switching from a statically scheduled to a dynamically scheduled (dataflow) execution model and (b) developing a new compiler IR for high-level synthesis, the value state flow graph (VSFG), that exposes instruction-level parallelism (ILP) aggressively even in the presence of complex control flow. Compared to existing control-data flow graph (CDFG)-based IRs, the VSFG exposes more ILP from control-intensive sequential code by exploiting aggressive speculation, enabling control dependence analysis, and allowing execution along multiple flows of control. This new IR is directly implemented as a static-dataflow graph in hardware by our prototype high-level synthesis tool chain and shows an average speedup of 1.13× over equivalent hardware generated using LegUp, an existing CDFG-based HLS tool. Furthermore, the VSFG allows us to trade area and energy for performance through loop unrolling, increasing the average speedup to 1.55×, with a peak speedup of 4.05×. Our VSFG-based hardware approaches the sequential cycle counts of an Intel Nehalem Core i7 processor while consuming only 0.25× the energy of an in-order Altera Nios II/f processor.
Funder
C3D: Communication Centric Computer Design
UK EPSRC
Publisher
Association for Computing Machinery (ACM)
Cited by
2 articles.
1. Hardware Reusability Optimization for High-Level Synthesis of Component-Based Processors;2022 11th International Conference on Communications, Circuits and Systems (ICCCAS);2022-05-13
2. A decoupled access-execute architecture for reconfigurable accelerators;Proceedings of the 15th ACM International Conference on Computing Frontiers;2018-05-08