Abstract
With the rapid advancement of complementary metal-oxide-semiconductor (CMOS) image sensors, cameras are widely adopted for their high image quality, while the computation of vision applications is typically offloaded to the cloud. This raises concerns for time-critical applications such as autonomous driving, surveillance, and defense systems, since moving pixels off the sensor’s focal plane is expensive. This paper presents a hardware architecture for smart cameras that identifies the salient regions of an image frame and then performs high-level inference on them, creating information at the sensor level instead of transporting raw pixels. A visual attention-oriented computational strategy filters out a significant amount of redundant spatiotemporal data collected at the focal plane; a computationally expensive learning model is then applied only to the interesting regions of the image. The hierarchical processing in the pixel data path follows a bottom-up architecture with massive parallelism and delivers high throughput by exploiting the large bandwidth available at the image source. We prototype the model on a field-programmable gate array (FPGA) and as an application-specific integrated circuit (ASIC) for integration with a pixel-parallel image sensor. The experimental results show that our approach achieves significant speedup and, under certain conditions, up to 45% higher energy efficiency with attention-oriented processing. Although attention-oriented processing incurs an area overhead, the gains in energy consumption, latency, and memory utilization outweigh that limitation.
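The abstract describes a two-stage pipeline: a cheap attention stage near the focal plane filters out redundant spatiotemporal data, and an expensive learning model runs only on the selected regions. The sketch below illustrates that control flow in software terms; the paper's actual attention mechanism, block size, and inference model are not specified in the abstract, so `attention_mask`, `extract_rois`, and `heavy_model` are hypothetical stand-ins chosen for illustration only.

```python
import numpy as np

# Minimal sketch of attention-gated processing, assuming a simple
# frame-difference saliency cue. Names and parameters are illustrative,
# not the paper's actual hardware design.

def attention_mask(frame, prev_frame, threshold=25):
    """Cheap bottom-up saliency: flag pixels with large temporal change.
    A software stand-in for pixel-parallel attention logic at the sensor."""
    diff = np.abs(frame.astype(np.int16) - prev_frame.astype(np.int16))
    return diff > threshold

def extract_rois(mask, block=32):
    """Return fixed-size blocks that contain at least one salient pixel."""
    h, w = mask.shape
    rois = []
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            if mask[y:y + block, x:x + block].any():
                rois.append((y, x, block, block))
    return rois

def heavy_model(patch):
    """Placeholder for the computationally expensive learning model."""
    return float(patch.mean())  # dummy inference result

def process_frame(frame, prev_frame):
    """Run expensive inference only on attended regions, so the camera
    emits compact sensor-level information instead of raw pixels."""
    results = []
    for y, x, h, w in extract_rois(attention_mask(frame, prev_frame)):
        results.append(((y, x), heavy_model(frame[y:y + h, x:x + w])))
    return results

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    prev = rng.integers(0, 256, (256, 256), dtype=np.uint8)
    cur = prev.copy()
    cur[64:96, 64:96] = 255  # synthetic moving object triggers attention
    print(process_frame(cur, prev))
```

Under these assumptions, only blocks overlapping the changed region reach `heavy_model`, which is the source of the latency and energy savings the abstract reports.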
Funder
National Science Foundation
Subject
Electrical and Electronic Engineering; Biochemistry; Instrumentation; Atomic and Molecular Physics, and Optics; Analytical Chemistry
Cited by
7 articles.