Abstract
Self-supervised learning has great potential for the remote sensing domain, where unlabelled observations are abundant but labels are hard to obtain. This work leverages unlabelled multi-modal remote sensing data for augmentation-free contrastive self-supervised learning. Deep neural network models are trained to maximize the similarity of latent representations obtained with different sensing techniques from the same location, while distinguishing them from representations of other locations. We showcase this idea with two self-supervised data fusion methods and compare them against standard supervised and self-supervised learning approaches on a land-cover classification task. Our results show that contrastive data fusion is a powerful self-supervised technique for training image encoders that produce meaningful representations: simple linear probing performs on par with fully supervised approaches, and fine-tuning with as little as 10% of the labelled data yields higher accuracy than supervised training on the entire dataset.
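The cross-modal objective described above — pulling together embeddings of the same location seen by different sensors, while pushing apart embeddings of different locations — is the InfoNCE-style contrastive loss commonly used for such methods. The sketch below is a minimal NumPy illustration of that idea, not the paper's exact implementation; the function name, temperature value, and batch layout are assumptions for illustration.

```python
import numpy as np

def info_nce_loss(z_a, z_b, temperature=0.1):
    """Cross-modal InfoNCE sketch (illustrative, not the paper's code).

    Row i of z_a (e.g. a SAR encoder's output) and row i of z_b
    (e.g. an optical encoder's output) come from the same location
    and form the positive pair; all other rows act as negatives.
    """
    # L2-normalise so dot products become cosine similarities
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature           # (N, N) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    # log-softmax over each row; the positive sits on the diagonal
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))
```

When the two modality embeddings of each location agree, the diagonal dominates each row and the loss is small; for unrelated embeddings the loss approaches log N, which is what drives the encoders to align co-located observations.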
Cited by
15 articles.
1. Multi-Modal Diffusion for Self-Supervised Pretraining;IGARSS 2024 - 2024 IEEE International Geoscience and Remote Sensing Symposium;2024-07-07
2. Boundary-Aware Adversarial Learning Domain Adaption and Active Learning for Cross-Sensor Building Extraction;IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing;2024
3. Land-Cover Classification with Self-Supervised ResNet50 (Integrating Plantation Data);2023 International Workshop on Artificial Intelligence and Image Processing (IWAIIP);2023-12-01
4. SatlasPretrain: A Large-Scale Dataset for Remote Sensing Image Understanding;2023 IEEE/CVF International Conference on Computer Vision (ICCV);2023-10-01
5. Joint Multi-Modal Self-Supervised Pre-Training in Remote Sensing: Application to Methane Source Classification;IGARSS 2023 - 2023 IEEE International Geoscience and Remote Sensing Symposium;2023-07-16