Affiliation:
1. Stanford University, Stanford, California
2. University of California, Berkeley, Berkeley, California
Abstract
Specialized image processing accelerators are necessary to deliver the performance and energy efficiency required by important applications in computer vision, computational photography, and augmented reality. But creating, “programming,” and integrating this hardware into a hardware/software system is difficult. We address this problem by extending the image processing language Halide so users can specify which portions of their applications should become hardware accelerators, and then we provide a compiler that uses this code to automatically create the accelerator along with the “glue” code needed for the user’s application to access this hardware. Starting with Halide not only provides a very high-level functional description of the hardware but also allows our compiler to generate a complete software application, which accesses the hardware for acceleration when appropriate. Our system also provides high-level semantics to explore different mappings of applications to a heterogeneous system, including the flexibility of being able to change the throughput rate of the generated hardware.
We demonstrate our approach by mapping applications to a commercial Xilinx Zynq system. Using its FPGA with two low-power ARM cores, our design achieves up to 6× higher performance and 38× lower energy compared to the quad-core ARM CPU on an NVIDIA Tegra K1, and 3.5× higher performance with 12× lower energy compared to the K1’s 192-core GPU.
Publisher
Association for Computing Machinery (ACM)
Subject
Hardware and Architecture, Information Systems, Software
Cited by
79 articles.