Author:
Lei Jingsheng, Shu Chente, Xu Qiang, Yu Yunxiang, Yang Shengying
Abstract
Traditional pyramid pooling modules improve semantic segmentation by capturing multi-scale feature information. However, their shallow structure fails to fully extract contextual information, and the fused multi-scale features lack distinctiveness, which reduces the discriminability of the final segmentation. To address these issues, we propose FCPFNet, a network based on a global contextual prior for deep extraction of detailed feature information. Specifically, we introduce a novel deep feature aggregation module that extracts semantic information from the output feature map of each layer through deep aggregation of context information and expands the effective perception range. Additionally, we propose an Efficient Pyramid Pooling Module (EPPM) that captures distinctive features by communicating information between different sub-features and performs multi-scale fusion; it is integrated as a branch within the network to compensate for the information lost in downsampling operations. Furthermore, to preserve rich image detail while maintaining a large receptive field for more contextual information, EPPM concatenates the input feature map with the output feature map of the pyramid pooling module to acquire more comprehensive global contextual information. Experiments show that the proposed method achieves competitive performance on the challenging scene segmentation datasets Pascal VOC 2012, Cityscapes, and COCO-Stuff, with mIoU of 81.0%, 78.8%, and 40.1%, respectively.
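For readers unfamiliar with the metric reported above, the following is a minimal sketch of how mean intersection-over-union (mIoU) is commonly computed from flattened label maps. It assumes the standard per-class definition; the function name `mean_iou`, the `ignore_label` parameter, and the toy labels are illustrative and not taken from the paper, whose exact evaluation protocol is not stated in the abstract.

```python
def mean_iou(pred, target, num_classes, ignore_label=None):
    """Average per-class IoU over classes that occur in pred or target.

    pred and target are flat sequences of integer class labels of equal length.
    Pixels whose ground-truth label equals ignore_label are skipped (assumption).
    """
    inter = [0] * num_classes  # pixels where prediction and ground truth agree
    union = [0] * num_classes  # pixels assigned to the class by either map
    for p, t in zip(pred, target):
        if t == ignore_label:
            continue
        if p == t:
            inter[t] += 1
            union[t] += 1
        else:
            union[p] += 1
            union[t] += 1
    # Classes absent from both maps have an empty union and are left out.
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes over six pixels (hypothetical data).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
```

In this toy case the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5. Benchmark implementations usually accumulate the intersection and union counts over the whole dataset before dividing, rather than averaging per image.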
Funder
Natural Science Foundation of China
Xinjiang Uygur Autonomous Region
Publisher
Springer Science and Business Media LLC