CD-MQANet: Enhancing Multi-Objective Semantic Segmentation of Remote Sensing Images through Channel Creation and Dual-Path Encoding

Author:

Zhang Jinglin1,Li Yuxia1,Zhang Bowei2,He Lei34ORCID,He Yuan2,Deng Wantao2,Si Yu1,Tong Zhonggui1,Gong Yushu1,Liao Kunwei3

Affiliation:

1. School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China

2. Southwest Institute of Technical Physics, Chengdu 610041, China

3. School of Software Engineering, Chengdu University of Information Technology, Chengdu 610225, China

4. Sichuan Province Engineering Technology Research Center of Support Software of Informatization Application, Chengdu 610225, China

Abstract

As a crucial computer vision task, multi-objective semantic segmentation has attracted widespread attention and research in the field of remote sensing image analysis. This technology has important application value in fields such as land resource surveys, global change monitoring, urban planning, and environmental monitoring. However, multi-target semantic segmentation of remote sensing images faces challenges such as complex surface features, complex spectral features, and a wide spatial range, resulting in differences in spatial and spectral dimensions among target features. To fully exploit and utilize spectral feature information, focusing on the information contained in spatial and spectral dimensions of multi-spectral images, and integrating external information, this paper constructs the CD-MQANet network structure, where C represents the Channel Creator module and D represents the Dual-Path Encoder. The Channel Creator module (CCM) mainly includes two parts: a generator block and a spectral attention module. The generator block aims to generate spectral channels that can expand different ground target types, while the spectral attention module can enhance useful spectral information. Dual-Path Encoders include channel encoders and spatial encoders, intended to fully utilize spectrally enhanced images while maintaining the spatial information of the original feature map. The decoder of CD-MQANet is a multitasking decoder composed of four types of attention, enhancing decoding capabilities. The loss function used in the CD-MQANet consists of three parts, which are generated by the intermediate results of the CCM, the intermediate results of the decoder, and the final segmentation results and label calculation. We performed experiments on the Potsdam dataset and the Vaihingen dataset. Compared to the baseline MQANet model, the CD-MQANet network improved mean F1 and OA by 2.03% and 2.49%, respectively, on the Potsdam dataset, and improved mean F1 and OA by 1.42% and 1.25%, respectively, on the Vaihingen dataset. The effectiveness of CD-MQANet was also proven by comparative experiments with other studies. We also conducted a thermographic analysis of the attention mechanism used in CD-MQANet and analyzed the intermediate results generated by CCM and LAM. Both modules generated intermediate results that had a significant positive impact on segmentation.

Funder

Key Projects from the Ministry of Science and Technology of China

Sichuan Science and Technology Program

Fengyun Satellite Application Advance Plan

Sichuan Natural Science Foundation Project

Publisher

MDPI AG

Subject

General Earth and Planetary Sciences

Reference39 articles.

1. Land Use Land Cover Changes and its Impacts on Water Resources in Nile Delta Region Using Remote Sensing Techniques;Elhag;Environ. Dev. Sustain.,2013

2. Zamari, M. (2023). A Proposal for a Wildfire Digital Twin Framework through Automatic Extraction of Remotely Sensed Data: The Italian Case Study of the Susa Valley. [Master’s Thesis, Politecnico di Torino].

3. Karamoutsou, L., and Psilovikos, A. (2021). Deep Learning in Water Resources Management: The Case Study of Kastoria Lake in Greece. Water, 13.

4. Ronneberger, O., Fischer, P., and Brox, T. (2015). International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer.

5. SegNet: A deep convolutional encoder-decoder architecture for image segmentation;Badrinarayanan;IEEE Trans. Pattern Anal. Mach. Intell.,2015

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3