Abstract
Object detection is an important component of computer vision. Most of the recent successful object detection methods are based on convolutional neural networks (CNNs). To improve the performance of these networks, researchers have designed many different architectures. They found that CNN performance benefits from carefully increasing the depth and width of the network structure along the spatial dimension. Some researchers have exploited the cardinality dimension, and others have found that skip and dense connections also benefit performance. Recently, attention mechanisms on the channel dimension have gained popularity with researchers. SENet uses global average pooling to generate the input feature vector of its channel-wise attention unit. In this work, we argue that channel-wise attention can benefit from both global average pooling and global max pooling. We designed three novel attention units, namely an adaptive channel-wise attention unit, an adaptive spatial-wise attention unit, and an adaptive domain attention unit, to improve the performance of a CNN. Instead of concatenating the two attention vectors generated by the two channel-wise attention sub-units, we weight the two vectors based on the output data of those sub-units. We integrated the proposed mechanism into the YOLOv3 and MobileNetv2 frameworks and tested the resulting networks on the KITTI and Pascal VOC datasets. The experimental results show that YOLOv3 with the proposed attention mechanism outperforms the original YOLOv3 by mAP values of 2.9% and 1.2% on the KITTI and Pascal VOC datasets, respectively. MobileNetv2 with the proposed attention mechanism outperforms the original MobileNetv2 by an mAP value of 1.7% on the Pascal VOC dataset.
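The abstract describes a channel-wise attention unit that derives two attention vectors, one from a global-average-pooling branch and one from a global-max-pooling branch, and combines them by adaptive weighting rather than concatenation. The sketch below illustrates this idea in PyTorch; the specific weighting scheme (a softmax over per-branch scores produced by a hypothetical `gate` layer) and the shared bottleneck MLP are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class AdaptiveChannelAttention(nn.Module):
    """Minimal sketch of an adaptive channel-wise attention unit that
    combines GAP and GMP branches by weighting their attention vectors.
    The gating mechanism here is an assumption for illustration only."""

    def __init__(self, channels, reduction=16):
        super().__init__()
        # Shared bottleneck MLP, as in SENet-style channel attention.
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )
        # Hypothetical gate scoring each branch from its attention vector.
        self.gate = nn.Linear(channels, 1)

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = x.mean(dim=(2, 3))   # global average pooling -> (b, c)
        mx = x.amax(dim=(2, 3))    # global max pooling -> (b, c)
        a_avg = self.mlp(avg)      # attention vector from the GAP sub-unit
        a_max = self.mlp(mx)       # attention vector from the GMP sub-unit
        # Weight the two attention vectors based on the sub-unit outputs,
        # instead of concatenating them.
        w = torch.softmax(
            torch.cat([self.gate(a_avg), self.gate(a_max)], dim=1), dim=1
        )                                             # branch weights (b, 2)
        a = w[:, :1] * a_avg + w[:, 1:] * a_max       # weighted combination
        scale = torch.sigmoid(a).view(b, c, 1, 1)
        return x * scale                              # channel recalibration
```

Such a unit could be dropped after a convolutional block in a YOLOv3 or MobileNetv2 backbone, e.g. `y = AdaptiveChannelAttention(256)(feature_map)`, leaving the spatial resolution unchanged.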
Publisher
Springer Science and Business Media LLC
Cited by
75 articles.