Hybrid Granularities Transformer for Fine-Grained Image Recognition

Published:2023-04-01 Issue:4 Volume:25 Page:601
ISSN:1099-4300
Container-title:Entropy
language:en
Short-container-title:Entropy

Author:

Yu Ying¹, Wang Jinghui¹

Affiliation:

1. School of Software, East China Jiaotong University, Nanchang 330013, China

Abstract

Many current approaches to image classification concentrate solely on the most prominent features within an image, but in fine-grained image recognition even subtle features can play a significant role in classification. In addition, the large intra-class variations and small inter-class differences characteristic of fine-grained image recognition make it challenging for a model to extract features that discriminate between categories. In this paper, we therefore present two lightweight modules that help the network discover more detailed information. (1) The Patches Hidden Integrator (PHI) module randomly selects patches from an image and replaces them with patches from other images of the same class. This allows the network to glean information from diverse discriminative regions and prevents over-reliance on a single feature, which can lead to misclassification. Additionally, it does not increase the training time. (2) The Consistency Feature Learning (CFL) module aggregates patch tokens from the last layer, mining local feature information and fusing it with the class token for classification. CFL also uses an inconsistency loss to force the network to learn the features common to both tokens, thereby guiding the network to focus on salient regions. We conducted experiments on three datasets: CUB-200-2011, Stanford Dogs, and Oxford 102 Flowers. We achieved 91.6%, 92.7%, and 99.5% on these datasets, respectively, which is competitive with other works.
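The two modules described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify how patches are chosen, how patch tokens are aggregated, or the form of the inconsistency loss. The function names, the `patch_size` and `swap_ratio` parameters, same-position patch replacement, mean pooling, concatenation, and the cosine-based loss term are all assumptions made for illustration.

```python
import numpy as np


def patches_hidden_integrator(images, labels, patch_size=4, swap_ratio=0.25, rng=None):
    """Replace random patches of each image with patches from a same-class image.

    images: float array of shape (B, H, W, C); H and W divisible by patch_size.
    labels: int array of shape (B,).
    Assumptions (not from the paper): one donor per image, replacement at the
    same spatial position, and a fixed fraction `swap_ratio` of patches swapped.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = images.copy()
    batch, height, width, _ = images.shape
    rows, cols = height // patch_size, width // patch_size
    n_swap = int(round(swap_ratio * rows * cols))

    for i in range(batch):
        # Candidate donors: other images in the batch that share the label.
        donors = np.flatnonzero((labels == labels[i]) & (np.arange(batch) != i))
        if donors.size == 0:
            continue  # no same-class partner in this batch; image is left unchanged
        donor = rng.choice(donors)
        for idx in rng.choice(rows * cols, size=n_swap, replace=False):
            r, c = divmod(int(idx), cols)
            ys = slice(r * patch_size, (r + 1) * patch_size)
            xs = slice(c * patch_size, (c + 1) * patch_size)
            out[i, ys, xs] = images[donor, ys, xs]
    return out


def consistency_feature_sketch(class_token, patch_tokens):
    """Fuse last-layer patch tokens with the class token and score their disagreement.

    class_token: (B, D); patch_tokens: (B, N, D).
    Assumptions (not from the paper): mean pooling as the aggregation,
    concatenation as the fusion, and 1 - cosine similarity as the loss term.
    """
    local = patch_tokens.mean(axis=1)                     # aggregated local feature
    fused = np.concatenate([class_token, local], axis=1)  # feature passed to a classifier
    cos = np.sum(class_token * local, axis=1) / (
        np.linalg.norm(class_token, axis=1) * np.linalg.norm(local, axis=1) + 1e-8
    )
    inconsistency = float(np.mean(1.0 - cos))             # small when the tokens agree
    return fused, inconsistency
```

In a training loop, a function like `patches_hidden_integrator` would be applied to each batch before the forward pass, and the inconsistency term would be added to the classification loss with some weight. Both choices are part of this sketch rather than details given in the abstract.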

Funder

National Natural Science Foundation of China

Natural Science Foundation of Jiangxi Province

Double Thousand Plan of Jiangxi Province in China

Postgraduate Innovation Fund of Education Department of Jiangxi Province

Publisher

MDPI AG

Subject

General Physics and Astronomy

Link

https://www.mdpi.com/1099-4300/25/4/601/pdf

Cited by 1 article.

1. Multi-level information fusion Transformer with background filter for fine-grained image recognition;Applied Intelligence;2024-06-20