Towards Super Compressed Neural Networks for Object Identification: Quantized Low-Rank Tensor Decomposition with Self-Attention
Published: 2024-04-02
Volume: 13
Issue: 7
Page: 1330
ISSN: 2079-9292
Container-title: Electronics
Short-container-title: Electronics
Language: en
Author:
Liu Baichen 1,2,3, Wang Dongwei 1,2,3, Lv Qi 4, Han Zhi 1,2, Tang Yandong 1,2
Affiliation:
1. State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China
2. Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang 110169, China
3. University of Chinese Academy of Sciences, Beijing 100049, China
4. School of Mechanical Engineering, University of Science and Technology Beijing, Beijing 100083, China
Abstract
Deep convolutional neural networks have a large number of parameters and require many floating-point operations during inference, which limits their deployment where storage space and computational resources are scarce, such as on mobile phones and small robots. Many network compression methods have been proposed to address this issue, including pruning, low-rank decomposition, and quantization. However, these methods typically fail to achieve a large compression ratio in terms of parameter count, and even when high compression rates are reached, the network's performance often deteriorates so much that it can no longer perform its task effectively. In this study, we propose a more compact representation for neural networks, named Quantized Low-Rank Tensor Decomposition (QLTD), to super compress deep convolutional neural networks. First, we employ low-rank Tucker decomposition to compress the pre-trained weights. Then, to further exploit redundancies within the core tensor and factor matrices obtained through Tucker decomposition, we apply vector quantization to partition and cluster their entries. Simultaneously, we introduce a self-attention module for each core tensor and factor matrix to enhance training responsiveness in critical regions. Object identification results on CIFAR10 show that QLTD achieves a compression ratio of 35.43× with less than a 1% loss in accuracy, and a compression ratio of 90.61× with less than a 2% loss in accuracy. QLTD thus attains a large parameter compression ratio while striking a good balance between compression and identification accuracy.
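The two compression stages the abstract describes (Tucker-decompose the pre-trained weights, then vector-quantize the resulting core tensor and factor matrices) can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the truncated-HOSVD routine, the plain k-means quantizer, and all names and hyperparameters (`tucker_hosvd`, `vector_quantize`, the chosen ranks and codebook size) are assumptions for the example, and the paper's self-attention module and fine-tuning are omitted.

```python
import numpy as np

def tucker_hosvd(W, ranks):
    """Truncated HOSVD, a standard way to compute a Tucker decomposition.
    Returns a core tensor G and one factor matrix per mode, such that
    W is approximated by G multiplied by each factor along its mode."""
    factors = []
    for mode, r in enumerate(ranks):
        # Unfold W along `mode` and keep the top-r left singular vectors.
        unfolding = np.moveaxis(W, mode, 0).reshape(W.shape[mode], -1)
        U, _, _ = np.linalg.svd(unfolding, full_matrices=False)
        factors.append(U[:, :r])
    # Core tensor: contract W with the transposed factor on every mode.
    G = W
    for mode, U in enumerate(factors):
        G = np.moveaxis(np.tensordot(U.T, np.moveaxis(G, mode, 0), axes=1), 0, mode)
    return G, factors

def tucker_reconstruct(G, factors):
    """Invert the contraction: multiply the core by each factor matrix."""
    W = G
    for mode, U in enumerate(factors):
        W = np.moveaxis(np.tensordot(U, np.moveaxis(W, mode, 0), axes=1), 0, mode)
    return W

def vector_quantize(M, n_codes=16, n_iter=20, seed=0):
    """Plain k-means over the rows of M. Storing only the small codebook
    plus per-row indices, instead of M itself, is what yields the extra
    compression on top of the low-rank factors."""
    rng = np.random.default_rng(seed)
    codebook = M[rng.choice(len(M), size=n_codes, replace=False)]
    for _ in range(n_iter):
        dist = ((M[:, None, :] - codebook[None]) ** 2).sum(-1)
        idx = dist.argmin(1)
        for k in range(n_codes):
            if (idx == k).any():
                codebook[k] = M[idx == k].mean(0)
    return codebook, idx

# Toy example: a 64x32 conv layer with 3x3 kernels, ranks chosen arbitrarily.
W = np.random.default_rng(1).standard_normal((64, 32, 3, 3))
G, Us = tucker_hosvd(W, ranks=(16, 8, 3, 3))
ratio = W.size / (G.size + sum(U.size for U in Us))  # parameter compression
codebook, idx = vector_quantize(Us[0], n_codes=8)     # quantize one factor
```

In the paper's setting the Tucker ranks and codebook sizes are what trade accuracy against the reported 35.43× and 90.61× compression ratios; the toy ranks above are illustrative only.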
Funder:
National Natural Science Foundation of China
CAS Project for Young Scientists in Basic Research
Youth Innovation Promotion Association of the Chinese Academy of Sciences