Author:
Li Guanxing, Cui Zhaotong, Li Meng, Han Yu, Li Tianping
Abstract
Recently, Transformer-based methods have gained prominence in image super-resolution (SR) tasks, addressing the challenge of long-range dependencies through the incorporation of cross-layer connectivity and local attention mechanisms. However, analysis of these networks using local attribution maps has revealed significant limitations in leveraging the spatial extent of input information. To unlock the inherent potential of Transformers in image SR, we propose the Multi-Attention Fusion Transformer (MAFT), a novel model designed to integrate multiple attention mechanisms with the objective of expanding the number and range of pixels activated during image reconstruction. This integration enhances the effective utilization of the input information space. At the core of our model lie the Multi-attention Adaptive Integration Groups, which facilitate the transition from dense local attention to sparse global attention through the introduction of Local Attention Aggregation and Global Attention Aggregation blocks with alternating connections, effectively broadening the network's receptive field. The effectiveness of the proposed algorithm has been validated through comprehensive quantitative and qualitative evaluation experiments on benchmark datasets. Compared to state-of-the-art methods (e.g., HAT), the proposed MAFT achieves a 0.09 dB gain on the Urban100 dataset for the ×4 SR task while using 32.55% fewer parameters and 38.01% fewer FLOPs.
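The alternation between dense local attention and sparse global attention described in the abstract can be sketched in a minimal, self-contained form. This is an illustrative assumption of the general pattern (windowed attention followed by strided global attention), not the paper's actual MAFT blocks; the helper names `local_attention` and `sparse_global_attention` are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # standard scaled dot-product attention over the token axis
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def local_attention(x, window=4):
    # dense attention restricted to non-overlapping windows of tokens
    n, _ = x.shape
    out = np.zeros_like(x)
    for s in range(0, n, window):
        w = x[s:s + window]
        out[s:s + window] = attention(w, w, w)
    return out

def sparse_global_attention(x, stride=4):
    # sparse attention: each token attends to a strided subset that
    # spans the whole sequence, widening the receptive field
    n, _ = x.shape
    out = np.zeros_like(x)
    for off in range(stride):
        sub = x[off::stride]
        out[off::stride] = attention(sub, sub, sub)
    return out

# one local -> global pair, mimicking the alternating connections
x = np.random.default_rng(0).normal(size=(16, 8))
y = sparse_global_attention(local_attention(x))
print(y.shape)  # (16, 8)
```

Stacking such pairs lets information first mix within windows and then propagate across the full token sequence, which is the intuition behind alternating local and global aggregation blocks.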
Funder
National Natural Science Foundation of China
Publisher
Springer Science and Business Media LLC