Why Not Both? An Attention-Guided Transformer with Pixel-Related Deconvolution Network for Face Super-Resolution

Published: 2024-04-29
Journal: Applied Sciences, Volume 14, Issue 9, Page 3793
ISSN: 2076-3417
Language: en

Author:

Zhang Zhe¹, Qi Chun¹

Affiliation:

1. School of Electronics and Information Engineering, Xi’an Jiaotong University, Xi’an 710049, China

Abstract

Transformer-based encoder-decoder networks for face super-resolution (FSR) have achieved promising success in delivering stunningly clear and detailed facial images by capturing local and global dependencies. However, these methods have certain limitations. Specifically, the deconvolution in upsampling layers neglects the relationship between adjacent pixels, which is crucial in facial structure reconstruction. Additionally, raw feature maps are fed to the transformer blocks directly without mining their potential feature information, resulting in suboptimal face images. To circumvent these problems, we propose an attention-guided transformer with pixel-related deconvolution network for FSR. Firstly, we devise a novel Attention-Guided Transformer Module (AGTM), which is composed of an Attention-Guiding Block (AGB) and a Channel-wise Multi-head Transformer Block (CMTB). AGTM at the top of the encoder-decoder network (AGTM-T) promotes both local facial details and global facial structures, while AGTM at the bottleneck side (AGTM-B) optimizes the encoded features. Secondly, a Pixel-Related Deconvolution (PRD) layer is specially designed to establish direct relationships among adjacent pixels in the upsampling process. Lastly, we develop a Multi-scale Feature Fusion Module (MFFM) to fuse multi-scale features for better network flexibility and reconstruction results. Quantitative and qualitative experimental results on various datasets demonstrate that the proposed method outperforms other state-of-the-art FSR methods.
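The limitation the abstract attributes to ordinary deconvolution can be illustrated with a small sketch. The code below is not the paper's PRD layer, whose formulation the abstract does not give. It is a plain 1-D transposed convolution written for illustration, and the function names `transposed_conv1d` and `contributing_inputs` are hypothetical. Under the assumption that the stride equals the kernel size, it shows one reading of the problem: each upsampled pixel is produced from a single input pixel, with no term that links it to its neighbours in the output.

```python
def transposed_conv1d(x, kernel, stride):
    """Plain 1-D transposed convolution with no padding (illustrative only)."""
    out_len = (len(x) - 1) * stride + len(kernel)
    out = [0.0] * out_len
    for i, xi in enumerate(x):
        for k, wk in enumerate(kernel):
            # Each input pixel scatters a scaled copy of the kernel into the output.
            out[i * stride + k] += xi * wk
    return out


def contributing_inputs(n_in, kernel_size, stride):
    """For every output position, list the input indices that contribute to it."""
    out_len = (n_in - 1) * stride + kernel_size
    deps = [[] for _ in range(out_len)]
    for i in range(n_in):
        for k in range(kernel_size):
            deps[i * stride + k].append(i)
    return deps


# Stride equal to kernel size: every output pixel depends on exactly one input.
print(transposed_conv1d([1.0, 2.0, 3.0], [1.0, 0.5], stride=2))
# [1.0, 0.5, 2.0, 1.0, 3.0, 1.5]
print(contributing_inputs(3, kernel_size=2, stride=2))
# [[0], [0], [1], [1], [2], [2]]

# A larger kernel makes some outputs share inputs, but only at the overlaps.
print(contributing_inputs(3, kernel_size=3, stride=2))
# [[0], [0], [0, 1], [1], [1, 2], [2], [2]]
```

In the first case, adjacent outputs such as positions 1 and 2 share no input at all. This is the kind of missing relationship between adjacent pixels that, according to the abstract, the PRD layer is designed to establish directly.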

Funder

National Natural Science Foundation of China

Publisher

MDPI AG

Link

https://www.mdpi.com/2076-3417/14/9/3793/pdf

References (64 articles)

1. Baker, S., and Kanade, T. (2000, January 28–30). Hallucinating faces. Proceedings of the IEEE International Conference on Automatic Face and Gesture Recognition, Grenoble, France.

2. Jiang (2023). Deep learning-based face super-resolution: A survey. ACM Comput. Surv.

3. Zhang (2006). An edge-guided image interpolation algorithm via directional filtering and data fusion. IEEE Trans. Image Process.

4. Chakrabarti (2007). Super-resolution of face images using kernel PCA-based prior. IEEE Trans. Multimed.

5. Jung (2011). Position-patch based face hallucination using convex optimization. IEEE Signal Process. Lett.