Affiliation:
1. School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
2. Southwest Institute of Technical Physics, Chengdu 610041, China
3. School of Software Engineering, Chengdu University of Information Technology, Chengdu 610225, China
4. Sichuan Province Engineering Technology Research Center of Support Software of Informatization Application, Chengdu 610225, China
Abstract
As a crucial computer vision task, multi-target semantic segmentation has attracted widespread attention in remote sensing image analysis. It has important applications in land resource surveys, global change monitoring, urban planning, and environmental monitoring. However, multi-target semantic segmentation of remote sensing images is challenging: surface features and spectral signatures are complex and scenes cover a wide spatial range, so target features differ in both the spatial and spectral dimensions. To fully exploit the information contained in the spatial and spectral dimensions of multi-spectral images, and to integrate external information, this paper constructs the CD-MQANet network, where C stands for the Channel Creator module and D for the Dual-Path Encoder. The Channel Creator module (CCM) consists of two parts: a generator block and a spectral attention module. The generator block aims to generate spectral channels that can expand different ground target types, while the spectral attention module enhances useful spectral information. The Dual-Path Encoder includes a channel encoder and a spatial encoder, and is intended to make full use of the spectrally enhanced images while preserving the spatial information of the original feature map. The decoder of CD-MQANet is a multitask decoder composed of four types of attention, which enhances its decoding capability. The loss function of CD-MQANet has three parts, each computed against the labels: one from the intermediate results of the CCM, one from the intermediate results of the decoder, and one from the final segmentation results. We performed experiments on the Potsdam and Vaihingen datasets. Compared to the baseline MQANet model, CD-MQANet improved mean F1 and overall accuracy (OA) by 2.03% and 2.49%, respectively, on the Potsdam dataset, and by 1.42% and 1.25%, respectively, on the Vaihingen dataset. Comparative experiments with other studies also demonstrated the effectiveness of CD-MQANet. We further conducted a heat map analysis of the attention mechanisms used in CD-MQANet and analyzed the intermediate results generated by the CCM and LAM. Both modules generated intermediate results that had a significant positive impact on segmentation.
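For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how mean F1 and OA are commonly computed from per-pixel class labels. It is not the authors' evaluation code: the function names, the flat-list input format, and the choice to average F1 over all classes (some benchmark protocols exclude a background or clutter class) are assumptions made for illustration.

```python
# Illustrative sketch only; names and evaluation protocol are assumptions,
# not taken from the CD-MQANet paper.

def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are true classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def overall_accuracy(cm):
    """OA: correctly classified pixels divided by all pixels."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total if total else 0.0


def mean_f1(cm):
    """Mean F1: unweighted average of the per-class F1 scores."""
    n = len(cm)
    scores = []
    for c in range(n):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, true class differs
        fn = sum(cm[c]) - tp                       # true c, predicted class differs
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n


# Toy example: six pixels, three classes (hypothetical data).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
oa = overall_accuracy(cm)
mf1 = mean_f1(cm)
```

Because mean F1 weights every class equally while OA weights every pixel equally, the two can move by different amounts when a change mainly helps small classes.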
Funder
Key Projects from the Ministry of Science and Technology of China
Sichuan Science and Technology Program
Fengyun Satellite Application Advance Plan
Sichuan Natural Science Foundation Project
Subject
General Earth and Planetary Sciences