Affiliation:
1. University of Wisconsin–Madison, Madison, Wisconsin, USA
Abstract
We propose a framework for Class-aware Personalized Neural Network Inference (CAP’NN), which prunes an already-trained neural network model based on the preferences of individual users. Specifically, by adapting to the subset of output classes that each user is expected to encounter, CAP’NN is able to prune not only ineffectual neurons but also miseffectual neurons that confuse classification, without the need to retrain the network. CAP’NN also exploits the similarities among pruning requests from different users to minimize the timing overhead of pruning the network. To achieve this, we propose a clustering algorithm that groups similar classes based on the firing rates of neurons for each class, and we implement a lightweight cache architecture to store and reuse information from previously pruned networks. In our experiments with the VGG-16, AlexNet, and ResNet-152 networks, CAP’NN achieves, on average, up to 47% model size reduction while actually improving top-1 (top-5) classification accuracy by up to 3.9% (3.4%) when the user encounters only a subset of the trained classes.
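The abstract does not spell out the pruning or caching algorithms, but the following Python sketch illustrates the two mechanisms it describes: pruning neurons by their per-class firing rates for a user's class subset, and clustering classes by firing-rate signature so a cache can reuse masks across similar requests. Every name, threshold, and design choice here (prune_mask, PruneCache, the miseffectual criterion, the use of k-means) is an illustrative assumption, not CAP’NN's actual method.

```python
# Sketch of class-aware pruning driven by per-class neuron firing rates.
# All thresholds and criteria below are hypothetical, not from the paper.
import numpy as np
from sklearn.cluster import KMeans


def prune_mask(firing_rates, user_classes, keep_thr=0.05, confuse_thr=0.5):
    """Boolean keep-mask over neurons for one user's class subset.

    firing_rates: (num_classes, num_neurons) mean activation per class,
    assumed to have been profiled once on held-out data.
    """
    subset = firing_rates[user_classes]                 # classes the user sees
    others = np.delete(firing_rates, user_classes, 0)   # everything else
    # "Ineffectual": essentially silent on the user's classes.
    ineffectual = subset.max(axis=0) < keep_thr
    # "Miseffectual" (assumed criterion): weak on the user's classes but
    # strongly driven by out-of-subset classes, injecting confusing signal.
    miseffectual = (subset.max(axis=0) < confuse_thr) & \
                   (others.max(axis=0) >= confuse_thr)
    return ~(ineffectual | miseffectual)


class PruneCache:
    """Reuse pruning work across users whose class subsets look alike."""

    def __init__(self, firing_rates, n_clusters=10, seed=0):
        self.rates = firing_rates
        # Each class is described by its firing-rate signature over neurons;
        # k-means groups classes with similar signatures (the abstract's
        # clustering step; k could be chosen e.g. via the elbow method).
        km = KMeans(n_clusters=n_clusters, random_state=seed, n_init=10)
        self.class_cluster = km.fit_predict(firing_rates)
        self._masks = {}

    def mask_for(self, user_classes):
        # Requests that touch the same clusters share one cached mask.
        key = frozenset(self.class_cluster[c] for c in user_classes)
        if key not in self._masks:
            self._masks[key] = prune_mask(self.rates, list(user_classes))
        return self._masks[key]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    rates = rng.random((1000, 4096))    # e.g. 1000 classes, one FC layer
    cache = PruneCache(rates)
    keep = cache.mask_for([3, 17, 42])
    print(f"kept {keep.mean():.1%} of neurons")
```

Keying the cache on the set of class clusters, rather than on the exact class subset, is what lets two different users with similar (but not identical) class lists hit the same cached mask; the trade-off is that the reused mask is only as precise as the clustering.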
Publisher
Association for Computing Machinery (ACM)
Subject
Hardware and Architecture, Software