Optimization of Remote-Sensing Image-Segmentation Decoder Based on Multi-Dilation and Large-Kernel Convolution

Published:2024-08-03 Issue:15 Volume:16 Page:2851
ISSN:2072-4292
Container-title:Remote Sensing
language:en
Short-container-title:Remote Sensing

Author:

Liu Guohong¹, Liu Cong², Wu Xianyun¹²³, Li Yunsong¹, Zhang Xiao³, Xu Junjie¹

Affiliation:

1. State Key Laboratory of Integrated Service Networks, Xidian University, Xi’an 710071, China

2. Guangzhou Institute of Technology, Xidian University, Guangzhou 510555, China

3. Hangzhou Institute of Technology, Xidian University, Hangzhou 311231, China

Abstract

Land-cover segmentation is a fundamental remote-sensing task with a wide range of applications. This work addresses several challenges in land-cover segmentation of remote-sensing imagery. First, to tackle foreground–background imbalance and scale variation, a fusion module based on convolutions with multiple dilation rates is integrated into the decoder. This module extends the receptive field through multi-dilation convolution, enhancing the model’s ability to capture global features. Second, to address scene diversity and background interference, a hybrid attention module based on large-kernel convolution is used to improve the decoder. This module combines spatial and channel attention mechanisms and uses large-kernel convolution to strengthen the extraction of contextual information. A convolution-kernel selection mechanism is also introduced to dynamically select the kernel with the appropriate receptive field, suppress irrelevant background information, and improve segmentation accuracy. Ablation studies on the Vaihingen and Potsdam datasets show that the improved decoder outperforms the baseline in mean intersection over union (mIoU) and mean F1 score, with gains of up to 1.73% and 1.17%, respectively. In quantitative comparisons, the improved decoder is also more accurate than other algorithms in the majority of categories. These results indicate that the improved decoder performs substantially better than the original decoder on remote-sensing image-segmentation tasks, which supports its application potential in land-cover segmentation.
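The two metrics reported in the abstract, mean intersection over union and mean F1 score, are conventionally derived from a per-class confusion matrix. The sketch below illustrates that standard computation in plain Python. It is not the authors' evaluation code: the function names and the toy 3-class confusion matrix are hypothetical, and the record does not state the exact protocol used (for example, which classes are averaged).

```python
def per_class_iou_f1(conf):
    """Return per-class IoU and F1 lists from a square confusion matrix.

    conf[i][j] is the number of pixels whose true class is i and whose
    predicted class is j (hypothetical layout chosen for this sketch).
    """
    n = len(conf)
    ious, f1s = [], []
    for c in range(n):
        tp = conf[c][c]                              # true positives
        fn = sum(conf[c]) - tp                       # true class c, predicted otherwise
        fp = sum(conf[r][c] for r in range(n)) - tp  # predicted c, true class differs
        iou_den = tp + fp + fn
        f1_den = 2 * tp + fp + fn
        # Classes with no pixels at all are scored 0.0 here (an assumption).
        ious.append(tp / iou_den if iou_den else 0.0)
        f1s.append(2 * tp / f1_den if f1_den else 0.0)
    return ious, f1s


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Toy confusion matrix with made-up pixel counts, for illustration only.
conf = [
    [50, 5, 5],
    [10, 30, 0],
    [0, 5, 45],
]

ious, f1s = per_class_iou_f1(conf)
miou = mean(ious)
mf1 = mean(f1s)
print(f"mIoU = {miou:.4f}, mean F1 = {mf1:.4f}")
```

For any single class the two scores are related by F1 = 2·IoU / (1 + IoU), so F1 is never lower than IoU. This is consistent with the mIoU gain reported in the abstract being larger than the mean F1 gain.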

Funder

China Postdoctoral Science Foundation

National Natural Science Foundation of China

the 111 Project

Shaanxi Provincial Science and Technology Innovation Team

the Fundamental Research Funds for the Central Universities

the Youth Innovation Team of Shaanxi Universities

Publisher

MDPI AG

Link

https://www.mdpi.com/2072-4292/16/15/2851/pdf

References (35 articles)

1. A Review of Deep Learning Methods for Semantic Segmentation of Remote Sensing Imagery;Yuan;Expert Syst. Appl.,2021

2. Deep Convolutional Neural Network for Semantic Image Segmentation;Qing;J. Image Graph.,2020

3. Development Course of Forestry Remote Sensing in China;Zengyuan;Natl. Remote Sens. Bull.,2021

4. ResUNet-a: A Deep Learning Framework for Semantic Segmentation of Remotely Sensed Data;Diakogiannis;ISPRS J. Photogramm. Remote Sens.,2020

5. Huo, Y., Gang, S., and Guan, C. (2023). FCIHMRT: Feature Cross-Layer Interaction Hybrid Method Based on Res2Net and Transformer for Remote Sensing Scene Classification. Electronics, 12.