Improved Fine-Grained Image Classification in Few-Shot Learning Based on Channel-Spatial Attention and Grouped Bilinear Convolution

Published: 2024-07-23

Author:

Zeng Ziwei¹, Li Lihong¹, Zhao Zoufei¹, Liu Qingqing¹

Affiliation:

1. Hebei University of Engineering

Abstract

Fine-grained image classification under few-shot constraints is difficult because inter-class differences are subtle. To improve the model's ability to recognize key visual patterns, such as eyes and beaks, this paper integrates spatial and channel attention mechanisms with grouped bilinear convolution and adapts them to the few-shot setting. Specifically, a novel neural network architecture is designed that combines channel and spatial information and applies the two interactively to jointly optimize the channel and spatial attention weights. To further capture the complex dependencies among features, a grouped bilinear convolution strategy is introduced: the weighted feature maps are divided into multiple independent groups, and bilinear operations are performed within each group. This strategy captures higher-order feature interactions while reducing the number of network parameters. Experiments on three fine-grained benchmark datasets across two few-shot tasks demonstrate the algorithm's advantage in handling fine-grained features. Notably, on the Stanford Cars dataset it achieves a classification accuracy of 95.42%, supporting its effectiveness and applicability in few-shot learning scenarios. Code is available at: https://github.com/204503zzw/atb.
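The grouping idea in the abstract can be illustrated with a small, dependency-free sketch. This is not the authors' implementation (their code is at the repository linked above); it assumes a plain average-pooled outer product as the bilinear operation, and the names `bilinear_pool`, `grouped_bilinear_pool`, and the toy `features` input are hypothetical. It only shows why splitting C channels into G groups shrinks the bilinear descriptor from C² to C²/G values, which in turn reduces the size of any layer that consumes it.

```python
from typing import List

# Assumed layout: C channels, each flattened to H*W spatial values.
FeatureMap = List[List[float]]


def bilinear_pool(x: FeatureMap) -> List[float]:
    """Outer product of the channel vector, averaged over spatial positions."""
    c, n = len(x), len(x[0])
    return [
        sum(x[i][p] * x[j][p] for p in range(n)) / n
        for i in range(c)
        for j in range(c)
    ]


def grouped_bilinear_pool(x: FeatureMap, groups: int) -> List[float]:
    """Split channels into equal groups and pool each group independently."""
    c = len(x)
    if c % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    size = c // groups
    out: List[float] = []
    for g in range(groups):
        # Interactions are computed only among channels of the same group.
        out.extend(bilinear_pool(x[g * size:(g + 1) * size]))
    return out


# Toy input (hypothetical): 4 channels, 2 spatial positions.
features = [[1.0, 2.0], [0.0, 1.0], [3.0, 1.0], [2.0, 2.0]]
full = bilinear_pool(features)                # 4 * 4 = 16 values
grouped = grouped_bilinear_pool(features, 2)  # 2 * (2 * 2) = 8 values
print(len(full), len(grouped))
```

With two groups the descriptor halves in size, at the cost of dropping cross-group channel interactions; how the paper's attention weighting and convolutional layers interact with this grouping is not specified in the abstract.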

Publisher

Springer Science and Business Media LLC
