SFA-Net: Semantic Feature Adjustment Network for Remote Sensing Image Segmentation
Published: 2024-09-03
Volume: 16
Issue: 17
Page: 3278
ISSN: 2072-4292
Container-title: Remote Sensing
Language: en
Authors:
Hwang Gyutae 1, Jeong Jiwoo 1, Lee Sang Jun 2
Affiliations:
1. Division of Electronic Engineering, Jeonbuk National University, Jeonju 54896, Republic of Korea
2. Future Semiconductor Convergence Technology Research Center, Division of Electronic Engineering, Jeonbuk National University, Jeonju 54896, Republic of Korea
Abstract
Advances in deep learning and computer vision have had a substantial impact on remote sensing, enabling efficient data analysis for applications such as land cover classification and change detection. Convolutional neural networks (CNNs) and transformer architectures are widely used in visual perception algorithms owing to their respective strengths in modeling local features and global context. In this paper, we propose a hybrid transformer architecture that consists of a CNN-based encoder and a transformer-based decoder. A feature adjustment module refines the multiscale feature maps extracted from an EfficientNet backbone, and the adjusted feature maps are integrated into the transformer-based decoder to perform semantic segmentation of remote sensing images. We refer to the proposed encoder–decoder architecture as the semantic feature adjustment network (SFA-Net). To demonstrate its effectiveness, we conducted thorough experiments on four public benchmark datasets: UAVid, ISPRS Potsdam, ISPRS Vaihingen, and LoveDA. The proposed model achieved state-of-the-art accuracy on the UAVid, ISPRS Vaihingen, and LoveDA datasets. On the ISPRS Potsdam dataset, it achieved accuracy comparable to the latest model while reducing the number of trainable parameters from 113.8 M to 10.7 M.
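As a rough illustration of the architecture the abstract describes, the following is a minimal, hypothetical PyTorch sketch: an EfficientNet encoder tapped at four scales, a feature adjustment step, and a small transformer-based decoder. The EfficientNet-B0 variant, the squeeze-and-excitation style channel attention inside the adjustment block, the 128-channel width, and all other layer choices are illustrative assumptions, not the authors' published implementation, whose internals are not specified in this abstract.

```python
# Hypothetical sketch of the SFA-Net idea: EfficientNet encoder ->
# feature adjustment of multiscale maps -> transformer-based decoder.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import efficientnet_b0
from torchvision.models.feature_extraction import create_feature_extractor


class FeatureAdjustment(nn.Module):
    """Assumed adjustment block: project to a common width, then
    re-weight channels with squeeze-and-excitation style attention."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(out_ch, out_ch, 1), nn.Sigmoid())

    def forward(self, x):
        x = self.proj(x)
        return x * self.gate(x)


class SFANetSketch(nn.Module):
    def __init__(self, num_classes: int = 6, dim: int = 128):
        super().__init__()
        backbone = efficientnet_b0(weights=None)  # pretrained weights omitted for a self-contained demo
        # Tap the ends of four stages (strides 4/8/16/32 for EfficientNet-B0).
        self.encoder = create_feature_extractor(
            backbone,
            return_nodes={"features.2": "s4", "features.3": "s8",
                          "features.5": "s16", "features.7": "s32"})
        self.adjust = nn.ModuleList(
            FeatureAdjustment(c, dim) for c in (24, 40, 112, 320))  # B0 stage widths
        layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=4, dim_feedforward=dim * 2, batch_first=True)
        self.decoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Conv2d(dim, num_classes, 1)

    def forward(self, x):
        h, w = x.shape[-2:]
        feats = list(self.encoder(x).values())
        # Adjust each scale, resize to the 1/4-resolution grid, and fuse by summation.
        fused = sum(
            F.interpolate(adj(f), size=feats[0].shape[-2:],
                          mode="bilinear", align_corners=False)
            for adj, f in zip(self.adjust, feats))
        b, c, fh, fw = fused.shape
        # Flatten the fused map into tokens for the transformer decoder.
        tokens = self.decoder(fused.flatten(2).transpose(1, 2))  # (B, HW, C)
        out = self.head(tokens.transpose(1, 2).reshape(b, c, fh, fw))
        return F.interpolate(out, size=(h, w), mode="bilinear", align_corners=False)


if __name__ == "__main__":
    logits = SFANetSketch()(torch.randn(1, 3, 256, 256))
    print(logits.shape)  # torch.Size([1, 6, 256, 256])
```

Fusing all scales at 1/4 resolution before the transformer is one plausible design choice among several; the published model may combine scales differently.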
Funder: the Korea government