Author:
Zhang Tong, Gao Peng, Dong Hao, Zhuang Yin, Wang Guanqun, Zhang Wei, Chen He
Abstract
Currently, under supervised learning, the dominant paradigm for knowledge transfer is to pre-train a model on a large-scale natural scene dataset and then fine-tune it on a small amount of task-specific labeled data. Unfortunately, because imaging data fall into different categories and data annotation is difficult, no remote sensing dataset is large and uniform enough to support large-scale pre-training in the remote sensing domain (RSD). Moreover, pre-training models on large-scale natural scene datasets by supervised learning and then directly fine-tuning them on diverse downstream tasks is a crude approach that is easily affected by inevitable labeling errors, severe domain gaps and task-aware discrepancies. Thus, in this paper, building on self-supervised pre-training and the powerful vision transformer (ViT) architecture, we propose a concise and effective knowledge transfer learning strategy called ConSecutive Pre-Training (CSPT). It is inspired by the idea from natural language processing (NLP) of not stopping pre-training, and it can gradually bridge the domain gap and transfer large-scale data knowledge to any specific domain (e.g., from the natural scene domain to the RSD). In addition, the proposed CSPT can unlock the large potential of unlabeled data for task-aware model training. Finally, extensive experiments were carried out on twelve remote sensing datasets covering three types of downstream tasks (scene classification, object detection and land cover classification) and two types of imaging data (optical and synthetic aperture radar (SAR)). The results show that, when the proposed CSPT is used for task-aware model training, almost all downstream tasks in the RSD outperform previous knowledge transfer learning strategies based on model pre-training, without any expensive manual labeling, and even surpass state-of-the-art (SOTA) performance without any careful network architecture design.
Funder
Chang Jiang Scholars Program
Civil Aviation Program
Space-based on orbit real-time processing technology program
National Science Foundation for Young Scientists of China
National Natural Science Foundation of China
Subject
General Earth and Planetary Sciences
Cited by
16 articles.