Affiliation:
1. University of Illinois at Urbana-Champaign
Abstract
We optimize a visual object detection application that uses Vision Video Library kernels and show that OpenCL is a unified programming paradigm that can provide high performance on the Ivy Bridge heterogeneous on-chip architecture. We evaluate different mapping techniques and show that running each kernel on the device where it fits best, combined with software pipelining, can provide 1.91 times higher performance and 42% better energy efficiency. We also show how to trade accuracy for energy at runtime. Overall, our application can perform accurate object detection at 40 frames per second (fps) in an energy-efficient manner.
Publisher
Association for Computing Machinery (ACM)
Subject
Hardware and Architecture, Information Systems, Software
Cited by
2 articles.
1. A black-box approach to energy-aware scheduling on integrated CPU-GPU systems; Proceedings of the 2016 International Symposium on Code Generation and Optimization; 2016-02-29
2. A Two-Level Task Scheduler on Multiple DSP System for OpenCL; Advances in Mechanical Engineering; 2014-01-01