Author:
Liu Yanan, Bose Laurie, Fan Rui, Dudek Piotr, Mayol-Cuevas Walterio
Abstract
Many Convolutional Neural Network (CNN) models and training methods have been proposed in recent years to improve efficiency on embedded and edge devices with limited computation and memory resources. The wide variety of architectures makes this a complex task that must balance generality with efficiency. Among the most interesting camera-sensor architectures are Pixel Processor Arrays (PPAs). This study presents two methods that are useful for embedded CNNs in general but particularly suitable for PPAs. The first trains purely binarized CNNs; the second deploys larger models through a model-swapping paradigm that loads model components dynamically. Specifically, we train and implement networks with batch normalization and an adaptive threshold for binary activations. We then convert batch normalization and binary activations into a bias matrix that can be applied in parallel with an add/sub operation. For dynamic model swapping, we propose decomposing applications that exceed the capacity of a PPA into sub-tasks solved by tree networks that are loaded dynamically as needed. We demonstrate our approaches on various tasks, including classification, localization, and coarse segmentation, on a highly resource-constrained PPA sensor-processor.
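The conversion of batch normalization plus a binary activation into a bias can be illustrated with a small sketch. The abstract does not give the exact formulation, so the code below assumes the standard algebraic folding of a per-channel batch-norm followed by a sign activation; the function names (`fold_bn_sign`, `bn_then_sign`, `folded_activation`) and the scalar, per-channel treatment are hypothetical and chosen only for illustration, not taken from the paper.

```python
import math


def fold_bn_sign(gamma, beta, mean, var, eps=1e-5):
    """Fold batch norm followed by a sign activation into one bias.

    Assumed identity (standard algebra, not quoted from the paper):
        gamma * (x - mean) / sigma + beta >= 0
    is equivalent to
        x + bias >= 0   when gamma > 0
        x + bias <= 0   when gamma < 0
    with sigma = sqrt(var + eps) and bias = beta * sigma / gamma - mean.
    """
    if gamma == 0:
        raise ValueError("gamma must be non-zero to fold batch norm")
    sigma = math.sqrt(var + eps)
    bias = beta * sigma / gamma - mean
    flip = gamma < 0  # a negative scale reverses the comparison
    return bias, flip


def bn_then_sign(x, gamma, beta, mean, var, eps=1e-5):
    """Reference path: explicit batch norm, then binarize to +1 / -1."""
    y = gamma * (x - mean) / math.sqrt(var + eps) + beta
    return 1 if y >= 0 else -1


def folded_activation(x, bias, flip):
    """Folded path: one addition and one comparison against zero."""
    s = x + bias
    if flip:
        return 1 if s <= 0 else -1
    return 1 if s >= 0 else -1


if __name__ == "__main__":
    bias, flip = fold_bn_sign(gamma=2.0, beta=1.0, mean=3.0, var=4.0, eps=0.0)
    for x in (-10.0, 0.0, 1.5, 2.5, 5.0):
        ref = bn_then_sign(x, 2.0, 1.0, 3.0, 4.0, eps=0.0)
        print(x, ref, folded_activation(x, bias, flip))
```

Under these assumptions, the multiply and divide of batch normalization disappear at inference time, leaving one addition per pixel, which is the kind of operation a SIMD array can apply to every element at once. Inputs that land within floating-point rounding of the threshold may differ between the two paths.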
Funder
UK Research and Innovation