Enhanced multi-scale networks for semantic segmentation

Published: 2023-12-04
ISSN: 2199-4536
Container-title: Complex & Intelligent Systems
Language: en
Short-container-title: Complex Intell. Syst.

Author:

Li Tianping, Cui Zhaotong, Han Yu, Li Guanxing, Li Meng, Wei Dongmei

Abstract

Multi-scale representation is an effective answer to the scale variation of objects and entities in semantic segmentation. The ability to capture long-range pixel dependencies also helps, and semantic segmentation further requires effective use of pixel-to-pixel similarity along the channel dimension to enhance pixel regions. Reviewing earlier successful segmentation models, we identify several elements that improve model performance: a robust encoder structure, multi-scale interactions, attention mechanisms, and a robust decoder structure. The asymmetric non-local neural network (ANNet) merges its attention mechanism with multi-scale pyramidal modules to accelerate segmentation while maintaining high accuracy. However, ANNet does not account for the similarity between pixels along the channel dimension of the feature map, which leaves its segmentation accuracy unsatisfactory. We therefore propose EMSNet, a straightforward convolutional network architecture for semantic segmentation that consists of an Integration of enhanced regional module (IERM) and a Multi-scale convolution module (MSCM). The IERM generates weights from four- or five-stage feature maps and then fuses the input features with those weights, which requires more computation. The similarity of the feature maps along the channel dimension is also calculated using ANNet’s auxiliary loss function. The MSCM more accurately describes the interactions between channels, captures the interdependencies between feature pixels, and captures multi-scale context. Experiments on benchmark datasets show that EMSNet performs well: it reaches 82.2% segmentation accuracy on the Cityscapes test data, and its mIoU on the ADE20K and Pascal VOC datasets is 45.58% and 85.46%, respectively.
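The abstract reports results as mIoU (mean intersection-over-union). As a reader's aid, the following is a minimal sketch of how that metric is commonly computed from predicted and ground-truth label maps. It is not the authors' evaluation code: the function name `mean_iou`, the flattened-list input format, and the toy labels are assumptions made for illustration, and benchmark tooling typically accumulates counts over a whole dataset and skips an "ignore" label, which is omitted here.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean IoU over the classes that occur in pred or target (hypothetical helper)."""
    if len(pred) != len(target):
        raise ValueError("pred and target must contain the same number of pixels")

    intersection = [0] * num_classes  # pixels where prediction and label agree on class c
    pred_count = [0] * num_classes    # pixels predicted as class c
    target_count = [0] * num_classes  # pixels labelled as class c

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # classes absent from both maps do not contribute
            ious.append(intersection[c] / union)

    if not ious:
        raise ValueError("no class is present in either label map")
    return sum(ious) / len(ious)


# Toy example (made-up labels): flattened 2x3 label maps with three classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {score:.4f}")
```

In the toy example the per-class IoU values are 1/3, 2/3, and 1/2, so the mean is 0.5. A reported figure such as 45.58% is this kind of average taken over all classes of the dataset.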

Funder

National Natural Science Foundation of China

Publisher

Springer Science and Business Media LLC

Subject

Computational Mathematics, Engineering (miscellaneous), Information Systems, Artificial Intelligence

Link

https://link.springer.com/content/pdf/10.1007/s40747-023-01279-x.pdf

Cited by 2 articles.

1. Containment Control-Guided Boundary Information for Semantic Segmentation. Applied Sciences, 2024-08-19

2. Nested attention network based on category contexts learning for semantic segmentation. Complex & Intelligent Systems, 2024-06-19