Towards an Efficient CNN Inference Architecture Enabling In-Sensor Processing

Published:2021-03-10 Issue:6 Volume:21 Page:1955
ISSN:1424-8220
Container-title:Sensors
language:en
Short-container-title:Sensors

Author:

Pantho Md Jubaer Hossain, Bhowmik Pankaj, Bobda Christophe

Abstract

The astounding development of optical sensing and imaging technology, coupled with the impressive improvements in machine learning algorithms, has increased our ability to understand and extract information from events in a scene. Convolutional neural networks (CNNs) are widely adopted to infer knowledge due to their surprising success in automation, surveillance, and many other application domains. However, the heavy computational demand of convolution operations has limited their use in remote sensing edge devices. On these platforms, real-time processing remains challenging because of tight constraints on resources and power. Here, the transfer and processing of non-relevant image pixels act as a bottleneck on the entire system. This bottleneck can be overcome by exploiting the high bandwidth available at the sensor interface and placing a CNN inference architecture near the sensor. This paper presents an attention-based pixel processing architecture to facilitate CNN inference near the image sensor. We propose an efficient computation method that reduces dynamic power by decreasing the overall computation of the convolution operations. The proposed method reduces redundancies by using a hierarchical optimization approach. The approach minimizes power consumption for convolution operations by exploiting the spatio-temporal redundancies found in the incoming feature maps and performs computations only on selected regions based on their relevance score. The proposed design addresses problems related to the mapping of computations onto an array of processing elements (PEs) and introduces a suitable network structure for communication. The PEs are highly optimized to provide low latency and low power for CNN applications. While designing the model, we exploit concepts from biological vision systems to reduce computation and energy.
We prototype the model on a Virtex UltraScale+ FPGA and implement it as an application-specific integrated circuit (ASIC) using the TSMC 90 nm technology library. The results suggest that the proposed architecture significantly reduces dynamic power consumption and achieves a high speedup, surpassing the computational capabilities of existing embedded processors.
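
The idea of computing only on regions selected by a relevance score can be illustrated with a small software sketch. This is not the paper's hardware design or its actual scoring method: the tile size, the threshold, the use of mean absolute frame difference as the relevance score, and all function names below are assumptions chosen for illustration. The sketch recomputes a CNN-style convolution (cross-correlation, no kernel flip) only inside tiles that changed enough between two frames and reuses the previous output everywhere else.

```python
# Illustrative sketch only. The relevance score (mean absolute temporal
# difference), tile size, threshold, and function names are assumptions,
# not the method described in the paper.

def tile_relevance(prev, curr, r0, c0, tile):
    """Mean absolute difference between two frames over one tile."""
    total, count = 0.0, 0
    for r in range(r0, min(r0 + tile, len(curr))):
        for c in range(c0, min(c0 + tile, len(curr[0]))):
            total += abs(curr[r][c] - prev[r][c])
            count += 1
    return total / count


def conv_at(frame, kernel, r, c):
    """One output pixel of a 'same'-size convolution with zero padding."""
    k = len(kernel) // 2
    acc = 0.0
    for dr in range(-k, k + 1):
        for dc in range(-k, k + 1):
            rr, cc = r + dr, c + dc
            if 0 <= rr < len(frame) and 0 <= cc < len(frame[0]):
                acc += frame[rr][cc] * kernel[dr + k][dc + k]
    return acc


def gated_conv(prev_frame, curr_frame, prev_out, kernel, tile=4, threshold=0.05):
    """Recompute outputs only in tiles whose relevance exceeds the threshold.

    Outputs in skipped tiles are copied from prev_out, so pixels that border
    a changed tile may be slightly stale; that is the approximation traded
    for fewer multiply-accumulate operations.
    Returns (output, number_of_pixels_recomputed).
    """
    h, w = len(curr_frame), len(curr_frame[0])
    out = [row[:] for row in prev_out]
    recomputed = 0
    for r0 in range(0, h, tile):
        for c0 in range(0, w, tile):
            if tile_relevance(prev_frame, curr_frame, r0, c0, tile) <= threshold:
                continue  # tile judged non-relevant: reuse previous result
            for r in range(r0, min(r0 + tile, h)):
                for c in range(c0, min(c0 + tile, w)):
                    out[r][c] = conv_at(curr_frame, kernel, r, c)
                    recomputed += 1
    return out, recomputed
```

For an 8x8 frame in which a single pixel changes, only the one 4x4 tile containing that pixel is recomputed, so the convolution work drops to a quarter of the full-frame cost in this toy case. Savings on real data depend on scene content and on how the relevance score and threshold are chosen.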

Funder

National Science Foundation

Publisher

MDPI AG

Subject

Electrical and Electronic Engineering; Biochemistry; Instrumentation; Atomic and Molecular Physics, and Optics; Analytical Chemistry

Link

https://www.mdpi.com/1424-8220/21/6/1955/pdf

References: 49 articles.

1. Efficient Processing of Deep Neural Networks: A Tutorial and Survey

2. Distributed Embedded Smart Cameras: Architectures, Design and Applications;Bobda,2014

3. Deep Learning for Computer Vision: A Brief Review

Cited by 10 articles.

1. Dynamic Neural Architecture Search for Image Classification;Proceedings of the Genetic and Evolutionary Computation Conference Companion;2024-07-14

2. SLIM-Net: Rethinking how neural networks use systolic arrays;2023 IEEE 5th International Conference on Artificial Intelligence Circuits and Systems (AICAS);2023-06-11

3. Performance–energy trade-offs of deep learning convolution algorithms on ARM processors;The Journal of Supercomputing;2023-01-21

4. Comparison of Supervised Learning Algorithms for Quality Assessment of Wearable Electrocardiograms With Paroxysmal Atrial Fibrillation;IEEE Access;2023

5. Towards a component-based acceleration of convolutional neural networks on FPGAs;Journal of Parallel and Distributed Computing;2022-09