Affiliation:
1. Purdue University, West Lafayette, IN
Abstract
Deep-learning neural networks have proven very successful for a wide range of recognition tasks across modern computing platforms. However, the computational requirements of such deep nets can be quite high, making their energy-efficient implementation of great interest. Although the entire network is traditionally utilized to recognize every input, we observe that classification difficulty varies widely across inputs in real-world datasets; only a small fraction of inputs requires the full computational effort of the network, while a large majority can be classified correctly with very low effort. In this article, we propose Conditional Deep Learning (CDL), where convolutional-layer features are used to identify the variability in the difficulty of input instances and to conditionally activate the deeper layers of the network. We achieve this by cascading a linear network of output neurons after each convolutional layer and monitoring its output to decide whether classification can be terminated at the current stage. The proposed methodology thus enables the network to dynamically adjust its computational effort to the difficulty of the input data while maintaining competitive classification accuracy. The overall energy benefits for the MNIST/CIFAR-10/Tiny ImageNet datasets with state-of-the-art deep-learning architectures are 1.84×/2.83×/4.02×, respectively. We further employ the conditional approach to train deep-learning networks from scratch with integrated supervision from the additional output neurons appended at the intermediate convolutional layers. The proposed integrated CDL training improves gradient convergence, yielding substantial error-rate reductions on MNIST/CIFAR-10 and improved classification over state-of-the-art baseline networks.
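To make the early-exit mechanism described in the abstract concrete, the following is a minimal sketch in PyTorch-style Python. The network shape, layer sizes, and the confidence threshold are illustrative assumptions for an MNIST-like input, not the authors' actual architecture or code; only the idea of a cascaded linear classifier gating the deeper layers is taken from the paper.

```python
# Minimal sketch of conditional (early-exit) inference in the spirit of CDL.
# Assumes PyTorch; layer sizes and the 0.9 threshold are illustrative
# placeholders, not the authors' original network.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConditionalNet(nn.Module):
    def __init__(self, num_classes=10, threshold=0.9):
        super().__init__()
        self.threshold = threshold
        self.conv1 = nn.Conv2d(1, 32, kernel_size=5, padding=2)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=5, padding=2)
        # Linear "output neurons" cascaded after the first conv stage,
        # used to decide whether the deeper layers need to be activated.
        self.exit1 = nn.Linear(32 * 14 * 14, num_classes)
        self.fc_final = nn.Linear(64 * 7 * 7, num_classes)

    def forward(self, x):
        # Stage 1: first convolutional layer plus early-exit classifier.
        x = F.max_pool2d(F.relu(self.conv1(x)), 2)      # 28x28 -> 14x14
        early_logits = self.exit1(x.flatten(1))
        confidence, _ = F.softmax(early_logits, dim=1).max(dim=1)
        if confidence.min() >= self.threshold:
            # Confident on every sample in the batch: terminate here and
            # skip the deeper (more expensive) layers.
            return early_logits
        # Stage 2: deeper layers activated only for hard inputs.
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)       # 14x14 -> 7x7
        return self.fc_final(x.flatten(1))
```

In practice the exit decision would typically be made per sample (batch size 1 at inference), so easy inputs return after the first stage while harder ones propagate through the deeper layers; the confidence threshold sets the trade-off between energy savings and accuracy.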
Funder
C-SPIN, one of the six centers of STARnet, a Semiconductor Research Corporation program sponsored by MARCO and DARPA
National Science Foundation
Intel Corporation
Vannevar Bush Faculty Fellowship
Publisher
Association for Computing Machinery (ACM)
Subject
Electrical and Electronic Engineering, Hardware and Architecture, Software
References
31 articles.
Cited by
36 articles.