Global domain adaptation attention with data-dependent regulator for scene segmentation

Published: 2024-02-14
Volume: 19
Issue: 2
Page: e0295263
ISSN: 1932-6203
Container title: PLOS ONE
Short container title: PLoS ONE
Language: en

Author:

Lei Qiuyuan, Lu Fei

Abstract

Most semantic segmentation works obtain accurate segmentation results by exploring contextual dependencies. However, several major limitations need further investigation. For example, most approaches rarely distinguish between different types of contextual dependencies, which may pollute scene understanding. Moreover, local convolutions are commonly used in deep learning models to learn attention and capture local patterns in the data. These convolutions operate on a small neighborhood of the input, focusing on nearby information and disregarding global structural patterns. To address these concerns, we propose a Global Domain Adaptation Attention with Data-Dependent Regulator (GDAAR) method to explore the contextual dependencies. Specifically, to capture both the global distribution information and local appearance details, we use a stacked relation approach: each feature node is combined with its pairwise affinities with all other feature nodes in the network, arranged in raster scan order. From this stacked representation we learn a global domain adaptation attention mechanism. Meanwhile, to improve the similarity of features belonging to the same segment region while keeping the discriminative power of features belonging to different segments, we design a data-dependent regulator that adjusts the global domain adaptation attention on the feature map during inference. Extensive ablation studies demonstrate that GDAAR better captures the global distribution information for the contextual dependencies and achieves state-of-the-art performance on several popular benchmarks.
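
The stacked relation idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the scaled dot-product affinity, the sigmoid gating, the projection vector `proj`, and the function name `stacked_relation_attention` are all assumptions made for illustration, and the data-dependent regulator is not modeled because the abstract does not specify it.

```python
import numpy as np

def stacked_relation_attention(feat, proj, bias=0.0):
    """Illustrative sketch only (hypothetical names, not the paper's code).

    feat: (C, H, W) feature map.
    proj: (C + H*W,) hypothetical projection weights for the stacked vector.
    """
    c, h, w = feat.shape
    n = h * w
    # Flatten spatial positions in raster-scan (row-major) order:
    # each pixel becomes one feature node of dimension C.
    nodes = feat.reshape(c, n).T                          # (N, C)
    # Pairwise affinities between every node and all nodes.
    # Assumption: scaled dot product is used as the affinity measure.
    affinity = nodes @ nodes.T / np.sqrt(c)               # (N, N)
    # Stack each node's own feature with its row of affinities.
    stacked = np.concatenate([nodes, affinity], axis=1)   # (N, C + N)
    # One scalar attention value per node (assumed sigmoid gate).
    logits = stacked @ proj + bias
    attn = 1.0 / (1.0 + np.exp(-logits))                  # (N,)
    # Reweight the feature map with the per-position attention.
    out = feat * attn.reshape(1, h, w)
    return out, attn.reshape(h, w), affinity
```

Because every node's stacked vector contains its affinities with all `H*W` positions, the resulting attention value depends on the whole feature map rather than on a small convolutional neighborhood, which is the contrast the abstract draws with local convolutions.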

Publisher

Public Library of Science (PLoS)

References (50 articles)

1. Wang P, Chen P, Yuan Y, Liu D, Huang Z, Hou X, et al. Understanding Convolution for Semantic Segmentation. In: 2018 IEEE Winter Conference on Applications of Computer Vision, WACV 2018, Lake Tahoe, NV, USA, March 12-15, 2018. IEEE Computer Society; 2018. p. 1451–1460.

2. Yang M, Yu K, Zhang C, Li Z, Yang K. DenseASPP for Semantic Segmentation in Street Scenes. In: 2018 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2018, Salt Lake City, UT, USA, June 18-22, 2018. Computer Vision Foundation / IEEE Computer Society; 2018. p. 3684–3692.

3. Ronneberger O, Fischer P, Brox T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In: Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015 – 18th International Conference, Munich, Germany, October 5-9, 2015, Proceedings, Part III. vol. 9351 of Lecture Notes in Computer Science. Springer; 2015. p. 234–241.

4. Zhao H, Shi J, Qi X, Wang X, Jia J. Pyramid Scene Parsing Network. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, Honolulu, HI, USA, July 21-26, 2017. IEEE Computer Society; 2017. p. 6230–6239.

5. Zhao H, Zhang Y, Liu S, Shi J, Loy CC, Lin D, et al. PSANet: Point-wise Spatial Attention Network for Scene Parsing. In: Computer Vision – ECCV 2018 – 15th European Conference, Munich, Germany, September 8-14, 2018, Proceedings, Part IX. vol. 11213 of Lecture Notes in Computer Science. Springer; 2018. p. 270–286.