Abstract
There is a well-known tradeoff between the compression ratio of a deep-learning model and its accuracy. In this paper, a strategy for refining pruning granularity and quantizing weights at the level of neural network filters is proposed. First, each filter in the network is divided into strip-like sub-filters (filter strips). Then, an importance score computed for each filter strip is used to assess the local importance of the filter; unimportant strips are pruned away and the remaining strips are reorganized. Finally, the recombined network is retrained and its weights are quantized to further reduce the computational cost. The results show that the method significantly reduces the computational cost of the neural network and compresses the number of model parameters. In experiments on ResNet56, the method reduces the parameter count to 1/4 and the computation to 1/5, with a model accuracy loss of only 0.01. On VGG16, the parameter count is reduced to 1/14 and the computation to 1/3, with an accuracy loss of 0.5%.
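The abstract does not specify how strip importance is scored, so the following is a minimal PyTorch sketch, assuming each KxK filter is split into K horizontal 1xK strips and that a strip's importance is its L1 weight norm; `strip_importance` and `prune_strips` are hypothetical helper names, not the authors' implementation.

```python
# Sketch of filter-strip pruning: split each filter into 1xK strips,
# score strips by L1 norm, and zero out the least-important ones.
# Assumption: L1-norm scoring; the paper's exact criterion may differ.
import torch
import torch.nn as nn


def strip_importance(conv: nn.Conv2d) -> torch.Tensor:
    """L1 norm of each horizontal 1xK strip in every filter.

    Returns a tensor of shape (out_channels, kernel_height).
    """
    # conv.weight has shape (out_channels, in_channels, kH, kW);
    # summing over in_channels and kW leaves one score per strip.
    return conv.weight.detach().abs().sum(dim=(1, 3))


def prune_strips(conv: nn.Conv2d, keep_ratio: float = 0.5) -> torch.Tensor:
    """Zero out the least-important strips, keeping `keep_ratio` of them."""
    scores = strip_importance(conv)               # (out_channels, kH)
    flat = scores.flatten()
    k = max(1, int(keep_ratio * flat.numel()))
    threshold = flat.topk(k).values.min()         # k-th largest score
    mask = (scores >= threshold).float()          # 1 = keep strip, 0 = prune
    with torch.no_grad():
        conv.weight *= mask[:, None, :, None]     # broadcast over in_ch, kW
    return mask


# Usage: prune half of the strips in one convolution layer.
conv = nn.Conv2d(16, 32, kernel_size=3, padding=1)
mask = prune_strips(conv, keep_ratio=0.5)
print(f"kept {int(mask.sum())} of {mask.numel()} strips")
```

After pruning, the paper's pipeline retrains the recombined network and quantizes its weights; both steps use standard training and quantization machinery and are omitted from this sketch.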
Funder
Hubei Provincial Department of Education
Subject
Electrical and Electronic Engineering, Biochemistry, Instrumentation, Atomic and Molecular Physics, and Optics, Analytical Chemistry
Cited by
7 articles.