Affiliation:
1. School of Surveying and Mapping, PLA Strategic Support Force Information Engineering University, Zhengzhou 450001, China
2. School of Geoscience and Technology, Zhengzhou University, Zhengzhou 450001, China
Abstract
Synthetic aperture radar (SAR) and optical images provide highly complementary ground information, and fusing the two modalities can significantly improve semantic segmentation results. However, fusing multimodal data remains challenging because the imaging mechanisms of the two sources differ substantially. Our goal was to bridge this gap by developing a dual-input model built on image-level fusion. Whereas most existing state-of-the-art image fusion methods assign equal weights to the modalities, we employed a principal component analysis (PCA) transform to combine them. We then performed feature-level fusion on the shallow feature maps, which retain rich geometric information, and incorporated a channel attention module to emphasize feature-rich channels and suppress irrelevant ones. This step is crucial because SAR and optical images are highly similar in the shallow layers, where features such as geometry dominate. In summary, we propose a generic multimodal fusion strategy that can be attached to most encoding–decoding structures for land cover classification tasks. The strategy is designed with two inputs: one is the optical image, and the other is the three-band fusion data obtained by combining the PCA component of the optical image with the SAR image. Our feature-level fusion method integrates the multimodal data effectively. The effectiveness of the approach was validated on several public datasets, and the results showed significant improvements when it was applied to multiple land cover classification models.
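The abstract does not spell out the exact combination rule, but a classic way to realize PCA-based image-level fusion is PC substitution: transform the optical bands into principal components, match the SAR band's statistics to the first component, substitute it, and invert the transform to get a three-band fused product. Below is a minimal NumPy sketch of that scheme; the function name `pca_fuse` and the mean/std matching step are illustrative assumptions, not the authors' exact pipeline.

```python
import numpy as np

def pca_fuse(optical, sar):
    """Illustrative PCA-substitution fusion (not the paper's exact method):
    replace the first principal component of the optical bands with a
    statistics-matched SAR band, then invert the transform to obtain a
    three-band fused image.

    optical: (H, W, 3) float array; sar: (H, W) float array.
    """
    h, w, b = optical.shape
    X = optical.reshape(-1, b).astype(np.float64)
    mean = X.mean(axis=0)
    Xc = X - mean
    # Eigen-decomposition of the band covariance matrix.
    cov = np.cov(Xc, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]      # sort components by variance
    V = eigvecs[:, order]
    pcs = Xc @ V                           # forward PCA
    # Match the SAR band's mean/std to the first principal component.
    s = sar.reshape(-1).astype(np.float64)
    s = (s - s.mean()) / (s.std() + 1e-8) * pcs[:, 0].std() + pcs[:, 0].mean()
    pcs[:, 0] = s                          # substitute PC1 with SAR
    fused = pcs @ V.T + mean               # inverse PCA (V is orthogonal)
    return fused.reshape(h, w, b)
```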
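For the channel attention module applied to the fused shallow feature maps, one common instantiation is squeeze-and-excitation style gating; the abstract does not pin down the authors' exact design, so the PyTorch sketch below (class `ChannelAttention`, reduction ratio 16) is an assumption meant only to illustrate how feature-rich channels can be amplified and irrelevant ones suppressed.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """SE-style channel attention (illustrative, not the paper's exact module):
    global average pooling summarizes each channel, a two-layer bottleneck
    predicts per-channel weights, and the input is reweighted accordingly."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):                      # x: (N, C, H, W)
        w = x.mean(dim=(2, 3))                 # squeeze: (N, C)
        w = self.fc(w).unsqueeze(-1).unsqueeze(-1)
        return x * w                           # reweight channels

# Hypothetical feature-level fusion of shallow optical and SAR feature maps
# (64 channels each): concatenate along channels, project, then gate.
fuse = nn.Sequential(nn.Conv2d(128, 64, kernel_size=1), ChannelAttention(64))
# fused = fuse(torch.cat([opt_feat, sar_feat], dim=1))
```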
Funder
National Natural Science Foundation of China