SRBPSwin: Single-Image Super-Resolution for Remote Sensing Images Using a Global Residual Multi-Attention Hybrid Back-Projection Network Based on the Swin Transformer
Published: 2024-06-20
Issue: 12
Volume: 16
Page: 2252
ISSN: 2072-4292
Container-title: Remote Sensing
Language: en
Short-container-title: Remote Sensing
Author:
Qin Yi 1,2, Wang Jiarong 1, Cao Shenyi 3, Zhu Ming 1, Sun Jiaqi 1,2, Hao Zhicheng 1, Jiang Xin 1
Affiliation:
1. Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China
2. University of Chinese Academy of Sciences, Beijing 100049, China
3. Faculty of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an 710049, China
Abstract
Remote sensing images usually contain abundant targets and complex information distributions. Consequently, networks are required to model both global and local information in the super-resolution (SR) reconstruction of remote sensing images. Existing SR reconstruction algorithms generally focus only on local or global features and neglect effective feedback for reconstruction errors. Therefore, a Global Residual Multi-attention Fusion Back-projection Network (SRBPSwin) is introduced by combining the back-projection mechanism with the Swin Transformer. We incorporate a concatenated Channel and Spatial Attention Block (CSAB) into the Swin Transformer Block (STB) to design a Multi-attention Hybrid Swin Transformer Block (MAHSTB). SRBPSwin develops dense back-projection units to provide bidirectional feedback on reconstruction errors, enhancing the network’s feature extraction capability and improving reconstruction performance. SRBPSwin consists of four main stages: shallow feature extraction, shallow feature refinement, dense back projection, and image reconstruction. Firstly, for the input low-resolution (LR) image, shallow features are extracted and refined through the shallow feature extraction and shallow feature refinement stages. Secondly, multiple up-projection and down-projection units are designed to alternately process features between high-resolution (HR) and LR spaces, yielding more accurate and detailed feature representations. Finally, global residual connections transfer shallow features to the image reconstruction stage. We also propose a perceptual loss function based on the Swin Transformer to enhance the detail of the reconstructed image. Extensive experiments demonstrate the significant reconstruction advantages of SRBPSwin in both quantitative evaluation and visual quality.
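As a concrete illustration of the dense back-projection stage described in the abstract, the sketch below shows one up-projection and one down-projection unit in the DBPN style that this mechanism builds on. It is a minimal PyTorch sketch, not the authors' implementation: SRBPSwin realizes these projections with Swin-Transformer-based blocks, whereas this sketch assumes plain (de)convolutions, and the 8×8 kernel with stride 4 (for 4× SR) is an illustrative configuration.

```python
import torch
import torch.nn as nn


class UpProjection(nn.Module):
    """LR features -> HR features with back-projected error feedback.

    Hypothetical DBPN-style unit; SRBPSwin uses Swin-based blocks instead
    of the plain (de)convolutions assumed here.
    """

    def __init__(self, ch: int, kernel: int = 8, stride: int = 4, pad: int = 2):
        super().__init__()
        self.up1 = nn.ConvTranspose2d(ch, ch, kernel, stride, pad)   # L -> H0
        self.down = nn.Conv2d(ch, ch, kernel, stride, pad)           # H0 -> L0
        self.up2 = nn.ConvTranspose2d(ch, ch, kernel, stride, pad)   # error -> H1
        self.act = nn.PReLU()

    def forward(self, lr: torch.Tensor) -> torch.Tensor:
        h0 = self.act(self.up1(lr))     # tentative HR features
        l0 = self.act(self.down(h0))    # project back down to LR space
        err = l0 - lr                   # reconstruction error in LR space
        h1 = self.act(self.up2(err))    # map the error back up to HR space
        return h0 + h1                  # error-corrected HR features


class DownProjection(nn.Module):
    """HR features -> LR features, mirroring UpProjection."""

    def __init__(self, ch: int, kernel: int = 8, stride: int = 4, pad: int = 2):
        super().__init__()
        self.down1 = nn.Conv2d(ch, ch, kernel, stride, pad)          # H -> L0
        self.up = nn.ConvTranspose2d(ch, ch, kernel, stride, pad)    # L0 -> H0
        self.down2 = nn.Conv2d(ch, ch, kernel, stride, pad)          # error -> L1
        self.act = nn.PReLU()

    def forward(self, hr: torch.Tensor) -> torch.Tensor:
        l0 = self.act(self.down1(hr))   # tentative LR features
        h0 = self.act(self.up(l0))      # project back up to HR space
        err = h0 - hr                   # reconstruction error in HR space
        l1 = self.act(self.down2(err))  # map the error back down to LR space
        return l0 + l1                  # error-corrected LR features


if __name__ == "__main__":
    feats = torch.randn(1, 64, 32, 32)            # shallow LR features
    up, down = UpProjection(64), DownProjection(64)
    hr = up(feats)                                # -> (1, 64, 128, 128) at 4x
    lr = down(hr)                                 # -> (1, 64, 32, 32)
    print(hr.shape, lr.shape)
```

Stacking several such unit pairs and densely connecting their outputs gives the alternating LR/HR error feedback between feature spaces that the abstract describes.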
Funder
Science and Technology Department of Jilin Province of China; Science and Technology Project of Jilin Provincial Education Department of China