NTCE-KD: Non-Target-Class-Enhanced Knowledge Distillation

Published:2024-06-03 Issue:11 Volume:24 Page:3617
ISSN:1424-8220
Container-title:Sensors
language:en
Short-container-title:Sensors

Author:

Li Chuan¹, Teng Xiao¹, Ding Yan¹, Lan Long¹

Affiliation:

1. College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China

Abstract

Most logit-based knowledge distillation methods transfer soft labels from the teacher model to the student model via Kullback–Leibler divergence based on softmax, an exponential normalization function. However, the exponential nature of softmax tends to prioritize the largest class (the target class) while neglecting the smaller ones (the non-target classes), so the significance of the non-target classes is overlooked. To address this issue, we propose Non-Target-Class-Enhanced Knowledge Distillation (NTCE-KD), which amplifies the role of non-target classes in terms of both magnitude and diversity. Specifically, we present a magnitude-enhanced Kullback–Leibler (MKL) divergence that multi-shrinks the target class to enhance the impact of non-target classes in terms of magnitude. Additionally, to enrich the diversity of non-target classes, we introduce a diversity-based data augmentation strategy (DDA), which further enhances overall performance. Extensive experimental results on the CIFAR-100 and ImageNet-1k datasets demonstrate that non-target classes are of great significance and that our method achieves state-of-the-art performance across a wide range of teacher–student pairs.
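
For orientation, the softmax-based Kullback–Leibler objective that the abstract takes as its starting point can be sketched as follows. This is a minimal illustration of the conventional temperature-scaled distillation loss only, not the MKL divergence or DDA proposed in the paper, whose definitions are not given on this page. The function names (`log_softmax`, `kd_kl_loss`, `non_target_mass`), the default temperature of 4, and the T² scaling factor are assumptions chosen for illustration.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in scaled))
    return [z - log_sum_exp for z in scaled]


def kd_kl_loss(teacher_logits, student_logits, temperature=4.0):
    """KL(teacher || student) on softened probabilities.

    The T^2 factor is the usual convention for keeping gradient
    magnitudes comparable across temperatures (an assumption here,
    not something stated on this page).
    """
    log_p = log_softmax(teacher_logits, temperature)
    log_q = log_softmax(student_logits, temperature)
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return kl * temperature ** 2


def non_target_mass(logits, target, temperature=1.0):
    """Total probability that softmax assigns to all non-target classes."""
    log_p = log_softmax(logits, temperature)
    return 1.0 - math.exp(log_p[target])


if __name__ == "__main__":
    teacher = [10.0, 2.0, 1.0]  # hypothetical logits; class 0 is the target
    student = [6.0, 3.0, 1.5]   # hypothetical student logits
    print("KD loss:", kd_kl_loss(teacher, student))
    print("non-target mass, T=1:", non_target_mass(teacher, target=0))
    print("non-target mass, T=4:", non_target_mass(teacher, target=0, temperature=4.0))
```

With these hypothetical logits, the non-target classes receive well under 1% of the probability mass at temperature 1, which illustrates the imbalance the abstract describes. Raising the temperature softens the distribution but still leaves the target class dominant.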

Funder

National Natural Science Foundation of China

Publisher

MDPI AG

Link

https://www.mdpi.com/1424-8220/24/11/3617/pdf

Cited by 1 article.

1. Research on defect recognition technology of transmission line based on visual macromodeling;Applied Mathematics and Nonlinear Sciences;2024-01-01