Affiliation:
1. College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin 150001, China
2. School of Electrical, Electronic, and Computer Engineering, The University of Western Australia, Perth 6009, Australia
Abstract
Remote sensing (RS) images play an indispensable role in key fields such as environmental monitoring, precision agriculture, and urban resource management. Traditional deep convolutional neural networks (CNNs) suffer from limited receptive fields. To address this problem, this paper introduces MBT-UNet, a hybrid network model that combines the strengths of CNNs and Transformers. First, a multi-branch encoder based on the pyramid vision transformer (PVT) is proposed to capture multi-scale feature information effectively; second, an efficient feature fusion module (FFM) is proposed to optimize the collaboration and integration of features at different scales; finally, in the decoder stage, a multi-scale upsampling module (MSUM) is proposed to refine the segmentation results and improve segmentation accuracy. We conduct experiments on the ISPRS Vaihingen, Potsdam, LoveDA, and UAVid datasets. Experimental results show that MBT-UNet outperforms state-of-the-art methods on key performance metrics, demonstrating its effectiveness for high-precision remote sensing image segmentation.
Funder
Natural Science Foundation of Heilongjiang Province
Fundamental Strengthening Program Technical Field Fund