Towards compressed and efficient CNN architectures via pruning

Published:2024-09-04 Issue:1 Volume:27
ISSN:2948-2992
Container-title:Discover Computing
language:en
Short-container-title:Discov Computing

Author:

Narkhede Meenal, Mahajan Shrinivas, Bartakke Prashant, Sutaone Mukul

Abstract

Convolutional Neural Networks (CNNs) use convolutional kernels to extract features from data, ranging from low-level to high-level. The performance of CNNs improves as they grow deeper, since deeper networks learn better representations of the data. However, such deep CNNs are compute- and memory-intensive, making deployment on resource-constrained devices challenging. To address this, CNNs are compressed with pruning strategies that remove redundant convolutional kernels from each layer while maintaining accuracy. Existing pruning methods based on feature map importance prune only the convolutional layers, and do so uniformly, without considering fully connected layers. Current techniques also ignore class labels when pruning the less important feature maps and do not explore the need for retraining after pruning. This paper presents techniques to prune both convolutional and fully connected layers. It proposes a novel class-specific pruning strategy that measures feature map importance by entropy for convolutional layers and by the number of incoming zeros to neurons for fully connected layers. The class-specific approach yields a different pruning threshold for every convolutional layer and ensures that the threshold is not influenced by any particular class. A study of whether the entire network or only part of it needs retraining after pruning is also carried out. On the Intel image, CIFAR10, and CIFAR100 datasets, the proposed pruning method compresses AlexNet by 83.2%, 87.19%, and 79.7%, VGG-16 by 83.7%, 85.11%, and 84.06%, and ResNet-50 by 62.99%, 62.3%, and 58.34%, respectively.
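The abstract does not give the exact scoring or thresholding rules, so the following is only a minimal sketch of what entropy-based, class-specific pruning of one convolutional layer might look like. Everything in it is an assumption made for illustration rather than the authors' method: the function names `feature_map_entropy` and `class_specific_keep_mask`, the histogram with 16 bins, and the choice to average per-class mean entropies into a single layer threshold.

```python
import numpy as np


def feature_map_entropy(fmap, bins=16):
    """Shannon entropy (in bits) of one feature map's activation histogram."""
    hist, _ = np.histogram(fmap, bins=bins)
    total = hist.sum()
    if total == 0:
        return 0.0
    p = hist[hist > 0] / total
    return float(-(p * np.log2(p)).sum())


def class_specific_keep_mask(activations, labels, bins=16):
    """Return a boolean mask over channels: True means keep, False means prune.

    activations: array of shape (N, C, H, W) collected from one conv layer.
    labels:      array of shape (N,) with the class label of each sample.
    """
    classes = np.unique(labels)
    n_channels = activations.shape[1]

    # per_class[k, c] = mean entropy of channel c over the samples of class k.
    per_class = np.stack([
        np.array([
            np.mean([feature_map_entropy(sample[c], bins)
                     for sample in activations[labels == k]])
            for c in range(n_channels)
        ])
        for k in classes
    ])

    # Assumed rule: each class contributes its own mean entropy, and the layer
    # threshold is the average of those, so every class has equal weight.
    threshold = per_class.mean(axis=1).mean()

    # Channel importance is its entropy averaged over classes.
    importance = per_class.mean(axis=0)
    return importance >= threshold
```

Because the threshold is computed from the activations of the layer being pruned, each convolutional layer ends up with its own value, which matches the layer-wise behaviour described in the abstract. The criterion for fully connected layers would need a separate scoring function.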

Publisher

Springer Science and Business Media LLC

Link

https://link.springer.com/content/pdf/10.1007/s10791-024-09463-4.pdf

References (58 articles)

1. Cheng Y, Wang D, Zhou P, et al. A survey of model compression and acceleration for deep neural networks. arXiv preprint arXiv:1710.09282. 2017.

2. Choudhary T, Mishra V, Goswami A, et al. A comprehensive survey on model compression and acceleration. Artif Intell Rev. 2020;53:5113–55.

3. Chung K, Lee C, Tsang Y, et al. Multi-objective evolutionary architectural pruning of deep convolutional neural networks with weights inheritance. Inf Sci. 2024;121265.

4. Deng T. A survey of convolutional neural networks for image classification: Models and datasets. In: 2022 international conference on big data, information and computer network (BDICN), IEEE, 2022;746–749.

5. Ding Y, Chen DR. Optimization based layer-wise pruning threshold method for accelerating convolutional neural networks. Mathematics. 2023;11(15):3311.