Affiliation:
1. The School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
2. The S-Lab, Nanyang Technological University, Singapore 639798, Singapore
Abstract
Self-supervised learning (SSL) has substantially narrowed the gap between supervised and unsupervised learning in computer vision tasks and has shown impressive success in remote sensing (RS). However, existing SSL methods have focused primarily on single-modal RS data, which limits their ability to capture the diversity of information in complex scenes. In this paper, we propose the Asymmetric Attention Fusion (AAF) framework to explore the potential of multi-modal representation learning, and we compare it with two simpler fusion methods: early fusion and late fusion. Because data from active sensors (e.g., digital surface models and light detection and ranging) is often noisier and less informative than optical images, AAF applies an asymmetric attention mechanism at each stage of a two-stream encoder. Additionally, we introduce a Transfer Gate module that selects the more informative features from the fused representations, enhancing performance in downstream tasks. Our comparative analyses on the ISPRS Potsdam datasets, covering scene classification and segmentation tasks, demonstrate significant performance gains for AAF over baseline methods. The proposed approach improves all metrics by over 7% compared to randomly initialized methods on both tasks. Furthermore, AAF consistently yields larger improvements than early fusion and late fusion. These results underscore the effectiveness of AAF in leveraging the strengths of multi-modal RS data for SSL, opening doors for more sophisticated and nuanced RS analysis.
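The abstract does not give the internals of AAF, so the following is only a toy sketch of what a one-directional attention fusion followed by a gate might look like, not the authors' implementation. Everything here is an assumption made for illustration: the function names `asymmetric_attention` and `transfer_gate`, the choice to let optical tokens query the auxiliary (e.g., DSM) tokens while only the optical stream is updated, the absence of learned projections, and the sigmoid channel gate.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def asymmetric_attention(optical, aux):
    """Assumed form of the asymmetry: optical tokens act as queries over the
    auxiliary tokens, and only the optical stream receives the fused context."""
    scale = math.sqrt(len(optical[0]))
    fused = []
    for q in optical:
        weights = softmax([dot(q, k) / scale for k in aux])
        context = [sum(w * v[d] for w, v in zip(weights, aux))
                   for d in range(len(q))]
        # Residual connection keeps the original optical feature.
        fused.append([qi + ci for qi, ci in zip(q, context)])
    return fused

def transfer_gate(features):
    """Hypothetical gate: each channel is scaled by the sigmoid of its mean
    over all tokens, so every gate value lies strictly between 0 and 1."""
    n, dims = len(features), len(features[0])
    means = [sum(tok[d] for tok in features) / n for d in range(dims)]
    gates = [1.0 / (1.0 + math.exp(-m)) for m in means]
    return [[g * x for g, x in zip(gates, tok)] for tok in features]

# Toy inputs: 2 optical tokens and 3 auxiliary (DSM-like) tokens, 2 channels each.
optical = [[1.0, 0.0], [0.0, 1.0]]
dsm = [[0.5, 0.5], [1.0, -1.0], [0.0, 0.2]]
fused = asymmetric_attention(optical, dsm)
gated = transfer_gate(fused)
```

In this sketch the fused output keeps the shape of the optical stream regardless of how many auxiliary tokens there are, which is one way a noisier modality can contribute context without dominating the representation. By contrast, early fusion would concatenate the modalities before encoding and late fusion would combine the two encoder outputs only at the end.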
Funder
National Natural Science Foundation of China
Subject
General Earth and Planetary Sciences
Cited by
1 article.
1. Transformer-Based Incomplete Multi-Modal Learning for Land Cover Classification;IGARSS 2024 - 2024 IEEE International Geoscience and Remote Sensing Symposium;2024-07-07