Abstract
Fine-grained image classification is a challenging problem because of its large intra-class differences and low inter-class variance. Bilinear pooling based models have been shown to be effective at fine-grained classification, while most previous approaches neglect the fact that distinctive features or modeling distinguishing regions usually have an important role in solving the fine-grained problem. In this paper, we propose a novel convolutional neural network framework, i.e., attention bilinear pooling, for fine-grained classification with attention. This framework can learn the distinctive feature information from the channel or spatial attention. Specifically, the channel and spatial attention allows the network to better focus on where the key targets are in the image. This paper embeds spatial attention and channel attention in the underlying network architecture to better represent image features. To further explore the differences between channels and spatial attention, we propose channel attention bilinear pooling (CAB), spatial attention bilinear pooling (SAB), channel spatial attention bilinear pooling (CSAB), and spatial channel attention bilinear pooling (SCAB) as four alternative frames. A variety of experiments on several datasets show that our proposed method has a very impressive performance compared to other methods based on bilinear pooling.
Subject
Physics and Astronomy (miscellaneous),General Mathematics,Chemistry (miscellaneous),Computer Science (miscellaneous)
Reference39 articles.
1. Selective Search for Object Recognition
2. Overfeat: Integrated recognition, localization and detection using convolutional networks;Sermanet;arXiv,2013
Cited by
10 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献