Affiliation:
1. Purdue University
2. National Taipei University of Technology, Taiwan, Republic of China
3. Loyola University Chicago, Chicago, IL
Abstract
Embedded devices are generally small, battery-powered computers with limited hardware resources. It is difficult to run deep neural networks (DNNs) on these devices, because DNNs perform millions of operations and consume significant amounts of energy. Prior research has shown that a considerable number of a DNN's memory accesses and computations are redundant when performing tasks like image classification. To reduce this redundancy and thereby reduce the energy consumption of DNNs, we introduce the Modular Neural Network Tree architecture. Instead of using one large DNN as the classifier, this architecture uses multiple smaller DNNs (called modules) to progressively classify images into groups of categories based on a novel visual similarity metric. Once a group of categories is selected by a module, another module then continues to distinguish among the similar categories within the selected group. This process is repeated over multiple modules until we are left with a single category. The computation needed to distinguish dissimilar groups is avoided, thus reducing redundant operations, memory accesses, and energy. Experimental results using several image datasets reveal the effectiveness of our proposed solution to reduce memory requirements by 50% to 99%, inference time by 55% to 95%, energy consumption by 52% to 94%, and the number of operations by 15% to 99% when compared with existing DNN architectures, running on two different embedded systems: Raspberry Pi 3 and Raspberry Pi Zero.
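The progressive, tree-structured inference described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the names (`ModuleNode`, `predict`) and the toy rule-based "classifiers" are assumptions standing in for the small DNN modules; in the actual architecture each node would be a trained network and the grouping would come from the visual similarity metric.

```python
# Minimal sketch of module-tree inference: each internal node holds a small
# classifier that routes an input to one of its child groups; descent stops
# when a single category (a leaf) remains. Toy stand-ins for the paper's DNNs.

class ModuleNode:
    def __init__(self, classifier, children):
        self.classifier = classifier  # callable: input -> child index
        self.children = children      # list of ModuleNode or str (leaf category)

def predict(node, image):
    """Descend the module tree until a single category (leaf) remains."""
    while isinstance(node, ModuleNode):
        node = node.children[node.classifier(image)]
    return node

# Toy example: the root module separates two dissimilar groups, then a
# second-level module distinguishes similar categories within each group.
animals = ModuleNode(lambda img: 0 if img["furry"] else 1, ["cat", "bird"])
vehicles = ModuleNode(lambda img: 0 if img["wheels"] > 2 else 1, ["car", "bike"])
root = ModuleNode(lambda img: 0 if img["alive"] else 1, [animals, vehicles])

print(predict(root, {"alive": True, "furry": True}))   # -> cat
print(predict(root, {"alive": False, "wheels": 4}))    # -> car
```

Note that classifying an animal never evaluates the vehicle module, which is how the tree avoids the computation needed to distinguish dissimilar groups.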
Funder
National Science Foundation
Publisher
Association for Computing Machinery (ACM)
Subject
Electrical and Electronic Engineering, Computer Graphics and Computer-Aided Design, Computer Science Applications
Cited by 15 articles.