Affiliation:
1. School of Computer Science and Technology, Xinjiang University, Ürümqi 830046, China
Abstract
Semantic segmentation is an active research topic in remote sensing image processing, with extensive applications in land planning and surveying. Many current studies combine Convolutional Neural Networks (CNNs), which extract local information, with Transformers, which capture global information, to obtain richer features. However, the fused features are often not sufficiently rich and lack detailed refinement. To address this issue, we propose the Multi-View Feature Fusion and Rich Information Refinement Network (MFRNet). The model is equipped with the Multi-View Feature Fusion Block (MAFF), which merges several types of information: local, non-local, channel, and positional. Within MAFF, we introduce two components. Sliding Heterogeneous Multi-Head Attention (SHMA) extracts local, non-local, and positional information using a sliding window, while Multi-Scale Hierarchical Compressed Channel Attention (MSCA) uses bar-shaped pooling kernels and stepwise compression to obtain reliable channel information. We also introduce the Efficient Feature Refinement Module (EFRM), which improves segmentation accuracy by letting the outputs of the Long-Range Information Perception Branch and the Local Semantic Information Perception Branch interact. We evaluate MFRNet on the ISPRS Vaihingen and Potsdam datasets; in extensive comparison experiments against state-of-the-art models, MFRNet outperforms the compared models.
Funder
the Scientific and Technological Innovation 2030 Major Project
the Basic Research Funds for Colleges and Universities in Xinjiang Uygur Autonomous Region
the Key Laboratory Open Projects in Xinjiang Uygur Autonomous Region
the Graduate Research and Innovation Project of Xinjiang Uygur Autonomous Region