Multi-Attention Multi-Image Super-Resolution Transformer (MAST) for Remote Sensing
Published: 2023-08-25
Volume: 15, Issue: 17, Page: 4183
ISSN: 2072-4292
Container-title: Remote Sensing
Short-container-title: Remote Sensing
Language: en
Author:
Li Jiaao 1,2,3, Lv Qunbo 1,2,3, Zhang Wenjian 1,2,3, Zhu Baoyu 1,2,3, Zhang Guiyu 1,2,3, Tan Zheng 1,2,3
Affiliation:
1. Aerospace Information Research Institute, Chinese Academy of Sciences, No. 9 Dengzhuang South Road, Haidian District, Beijing 100094, China
2. School of Optoelectronics, University of Chinese Academy of Sciences, No. 19(A) Yuquan Road, Shijingshan District, Beijing 100049, China
3. Key Laboratory of Computational Optical Imaging Technology, CAS, No. 9 Dengzhuang South Road, Haidian District, Beijing 100094, China
Abstract
Deep-learning-driven multi-image super-resolution (MISR) reconstruction techniques have significant application value in the field of aerospace remote sensing. In particular, Transformer-based models have shown outstanding performance in super-resolution tasks. However, current MISR models have deficiencies in their use of multi-scale information and in the modeling of the attention mechanism, leading to insufficient utilization of the complementary information across multiple images. To address this, we propose a Multi-Attention Multi-Image Super-Resolution Transformer (MAST), which improves on prior work in two main aspects. Firstly, we present a Multi-Scale and Mixed Attention Block (MMAB). With its multi-scale structure, the network extracts image features at different scales to obtain more contextual information, and the mixed attention allows the network to fully explore high-frequency features in both the channel and spatial dimensions. Secondly, we propose a Collaborative Attention Fusion Block (CAFB). By incorporating channel attention into the self-attention layer of the Transformer, we better establish global correlations between multiple images. To improve the network's perception of local detail, we further introduce a Residual Local Attention Block (RLAB). With these improvements, our model can better extract and utilize non-redundant information, achieving a restoration quality that balances the global structure and local details of the image. Comparative experiments show that our approach improves cPSNR by 0.91 dB and 0.81 dB on the NIR and RED bands of the PROBA-V dataset, respectively, compared with existing state-of-the-art methods. Extensive experiments demonstrate that the proposed method can serve as a valuable reference for multi-image super-resolution in remote sensing.
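The "mixed attention" idea in the MMAB — gating features along both the channel and the spatial dimension — can be illustrated with a minimal, parameter-free sketch. Note this is not MAST's actual block: the paper's MMAB uses learned multi-scale convolutions, whereas the toy version below replaces all learned layers with simple pooling purely to show the two gating patterns and the residual path; every function name here is illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat):
    # feat: (C, H, W). Global average pooling yields one descriptor per
    # channel; a sigmoid gate then reweights the channels
    # (squeeze-and-excitation style, but without the learned MLP).
    desc = feat.mean(axis=(1, 2))        # (C,)
    gate = sigmoid(desc)                 # (C,) values in (0, 1)
    return feat * gate[:, None, None]

def spatial_attention(feat):
    # Collapse the channel axis to a single (H, W) map and gate each
    # spatial location independently.
    desc = feat.mean(axis=0)             # (H, W)
    gate = sigmoid(desc)                 # (H, W) values in (0, 1)
    return feat * gate[None, :, :]

def mixed_attention(feat):
    # Chain both gates and keep a residual connection, so the block can
    # fall back toward identity behaviour when the gates are uninformative.
    return feat + spatial_attention(channel_attention(feat))
```

The residual form mirrors common practice in attention-based super-resolution networks: the gated branch only has to model the correction, not the full signal.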
Funder
Key Program Project of Science and Technology Innovation of the Chinese Academy of Sciences; Innovation Foundation of the Key Laboratory of Computational Optical Imaging Technology, CAS
Subject
General Earth and Planetary Sciences