
Compression of Deep-Learning Models Through Global Weight Pruning Using Alternating Direction Method of Multipliers

Published:2023-02-24 Issue:1 Volume:16 Page:
ISSN:1875-6883
Container-title:International Journal of Computational Intelligence Systems
language:en
Short-container-title:Int J Comput Intell Syst

Author:

Lee Kichun, Hwangbo Sunghun, Yang Dongwook, Lee Geonseok

Abstract

Deep learning has shown excellent performance in numerous machine-learning tasks, but one practical obstacle is that it demands large amounts of computation and memory. Model compression is especially useful in deep learning because it saves memory and reduces storage size while maintaining model performance. In a layered network structure, model compression aims to reduce the number of edges by pruning weights deemed unnecessary for the computation. However, existing weight pruning methods perform a layer-by-layer reduction, which requires a predefined removal-ratio constraint for each layer. These per-layer removal ratios must be specified according to the structure and the task, and the resulting large number of tuning parameters sharply increases training time. Such a layer-by-layer strategy is therefore hardly feasible for deep layered models. Our proposed method performs weight pruning in a deep layered network, with similar performance, by setting a single global removal ratio for the entire model without prior knowledge of its structural characteristics. Our experiments with the proposed method show reliable and high-quality performance, obviating layer-by-layer removal ratios. Furthermore, experiments with an increasing number of layers yield a pattern in the pruned weights that could provide insight into the structural importance of each layer. In the experiment with the LeNet-5 model on MNIST data, the proposed method reaches a higher compression ratio of 98.8%, outperforming existing pruning algorithms. In the ResNet-56 experiment, the performance change across removal ratios of 10–90% is investigated, and a higher removal ratio is achieved compared to other tested models. We also demonstrate the effectiveness of the proposed method with YOLOv4, a real-life object-detection model requiring substantial computation.
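The difference between a global removal ratio and per-layer ratios can be illustrated with a small NumPy sketch. It is not the authors' ADMM procedure: it shows only a plain magnitude-based pruning step in which one threshold is shared by every layer. The function names (`global_prune`, `sparsity`), the layer names, and the toy weight shapes are all hypothetical and chosen for illustration.

```python
import numpy as np


def global_prune(layers, removal_ratio):
    """Zero the smallest-magnitude weights across ALL layers at once.

    Illustrative magnitude pruning only (not the paper's ADMM method).
    layers: dict mapping a layer name to a NumPy weight array.
    removal_ratio: fraction of all weights in the model to set to zero.
    """
    flat = np.concatenate([np.abs(w).ravel() for w in layers.values()])
    k = int(removal_ratio * flat.size)  # number of weights to remove
    if k == 0:
        return {name: w.copy() for name, w in layers.items()}
    # The k-th smallest magnitude is the single threshold for every layer.
    threshold = np.partition(flat, k - 1)[k - 1]
    return {
        name: np.where(np.abs(w) > threshold, w, 0.0)
        for name, w in layers.items()
    }


def sparsity(w):
    """Fraction of entries in w that are exactly zero."""
    return float((w == 0).mean())


# Toy model (assumed shapes): the "fc" weights are deliberately smaller.
rng = np.random.default_rng(0)
layers = {
    "conv": rng.normal(scale=1.0, size=(8, 4)),
    "fc": rng.normal(scale=0.1, size=(16, 16)),
}

pruned = global_prune(layers, removal_ratio=0.5)
for name, w in pruned.items():
    print(name, "sparsity:", round(sparsity(w), 3))
```

Only the overall ratio is specified; the per-layer sparsity falls out of the shared threshold, so the layer with smaller weights ends up pruned more heavily than the other. A per-layer scheme would instead need one ratio per layer as a tuning parameter.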

Funder

Ministry of Trade, Industry and Energy

Publisher

Springer Science and Business Media LLC

Subject

Computational Mathematics, General Computer Science

Link

https://link.springer.com/content/pdf/10.1007/s44196-023-00202-z.pdf

References: 32 articles.

1. Brzezinski, D., Stefanowski, J.: Reacting to different types of concept drift: the accuracy updated ensemble algorithm. IEEE Trans. Neural Netw. Learn. Syst. 25(1), 81–94 (2013)

2. Guo, H., Liu, H., Li, R., Changan, W., Guo, Y., Mingliang, X.: Margin & diversity based ordering ensemble pruning. Neurocomputing 275, 237–246 (2018)

3. Petchrompo, S., Coit, D.W., Brintrup, A., Wannakrairot, A., Parlikad, A.K.: A review of Pareto pruning methods for multi-objective optimization. Computers Ind. Eng. 19, 108022 (2022)

4. Goel, K., Batra, S.: Two-level pruning based ensemble with abstained learners for concept drift in data streams. Expert. Syst. 38(3), e12661 (2021)

5. Deng, L., Li, G., Han, S., Shi, L., Xie, Y.: Model compression and hardware acceleration for neural networks: a comprehensive survey. Proc. IEEE 108(4), 485–532 (2020)

Cited by 1 article.

1. A comprehensive review of model compression techniques in machine learning. Applied Intelligence (2024-09-02)