Affiliation:
1. School of Electrical, Computer and Biomedical Engineering, Southern Illinois University, Carbondale, IL 62901, USA
Abstract
Deep Neural Networks (DNNs) have achieved impressive performance in various image recognition tasks, but their large model sizes make them challenging to deploy on resource-constrained devices. In this paper, we propose a dynamic DNN pruning approach that takes into account the difficulty of the incoming images during inference. To evaluate the effectiveness of our method, we conducted experiments on the ImageNet dataset with several state-of-the-art DNNs. Our results show that the proposed approach reduces model size and the number of DNN operations without the need to retrain or fine-tune the pruned model. Overall, our method provides a promising direction for designing efficient frameworks for lightweight DNN models that can adapt to the varying complexity of input images.
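The difficulty-aware pruning idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' method: it assumes difficulty is estimated from the entropy of an early-exit softmax and that pruning is magnitude-based channel masking, neither of which is specified in the abstract.

```python
import numpy as np

def difficulty_score(probs):
    """Proxy difficulty in [0, 1]: normalized entropy of an early-exit
    softmax. Near 0 for confident (easy) inputs, near 1 for uncertain
    (hard) ones. This estimator is an assumption for illustration."""
    p = np.clip(np.asarray(probs, dtype=float), 1e-12, 1.0)
    return float(-(p * np.log(p)).sum() / np.log(len(p)))

def channel_mask(weight, difficulty, max_prune=0.5):
    """Return a boolean keep-mask over output channels of a layer.

    weight: (out_channels, in_features) matrix.
    Easy inputs (low difficulty) tolerate aggressive pruning; hard
    inputs keep more channels. Channels with the smallest L2 norms
    are dropped first (magnitude criterion, assumed here)."""
    frac = max_prune * (1.0 - difficulty)   # pruned fraction shrinks with difficulty
    norms = np.linalg.norm(weight, axis=1)  # per-channel importance proxy
    k = int(frac * len(norms))              # number of channels to drop
    keep = np.ones(len(norms), dtype=bool)
    if k > 0:
        keep[np.argsort(norms)[:k]] = False
    return keep
```

At inference time, a cheap preliminary pass would produce the early-exit probabilities, and the resulting mask would be applied to the remaining layers for that input only, avoiding any retraining of the pruned model.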
Funder
Consortium for Embedded Systems at SIUC
Subject
Electrical and Electronic Engineering, Mechanical Engineering, Control and Systems Engineering
References (27 articles)
1. Spantidi, O., Zervakis, G., Anagnostopoulos, I., Amrouch, H., and Henkel, J. (2021, January 1–4). Positive/negative approximate multipliers for DNN accelerators. Proceedings of the IEEE/ACM International Conference On Computer Aided Design (ICCAD), Munich, Germany.
2. Spantidi, O., and Anagnostopoulos, I. (2022, January 6–7). How much is too much error? Analyzing the impact of approximate multipliers on DNNs. Proceedings of the 23rd International Symposium on Quality Electronic Design (ISQED), Virtual Event.
3. Amrouch, H., et al. (2020). NPU Thermal Management. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst.
4. Yang, H., Gui, S., Zhu, Y., and Liu, J. (2020, January 14–19). Automatic neural network compression by sparsity-quantization joint learning: A constrained optimization-based approach. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual Event.
5. Han, S., Mao, H., and Dally, W.J. (2015). Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding. arXiv.
Cited by
1 article.