A Decoder Structure Guided CNN‐Transformer Network for face super‐resolution

Author:

Dou Rui (1, 2), Li Jiawen (1), Wan Xujie (1), Chang Heyou (3), Zheng Hao (3), Gao Guangwei (1, 2)

Affiliation:

1. Institute of Advanced Technology, Nanjing University of Posts and Telecommunications, Nanjing, China

2. Provincial Key Laboratory for Computer Information Processing Technology, Soochow University, Suzhou, China

3. Key Laboratory of Intelligent Information Processing, Nanjing Xiaozhuang University, Nanjing, China

Abstract

Recent advances in deep convolutional neural networks have improved face super‐resolution through joint training with auxiliary tasks such as face analysis and landmark prediction. However, these methods have two main limitations. First, multi‐task joint learning requires manually annotated labels on the dataset, and this additional annotation step increases the computational cost of the network model. Second, because prior information is often estimated from low‐quality faces, the resulting guidance tends to be inaccurate. To address these challenges, a novel Decoder Structure Guided CNN‐Transformer Network (DCTNet) is introduced, which utilises the newly proposed Global‐Local Feature Extraction Unit (GLFEU) for effective embedding. Specifically, the GLFEU is composed of an attention branch and a Transformer branch, which together restore global facial structure and local texture details. In addition, a Multi‐Stage Feature Fusion Module fuses features from different network stages, further improving the quality of the restored face images. Compared with previous methods, DCTNet improves Peak Signal‐to‐Noise Ratio (PSNR) by 0.23 dB on the CelebA dataset and 0.19 dB on the Helen dataset. Experimental results demonstrate that DCTNet offers a simple yet powerful solution for recovering detailed facial structures from low‐quality images.
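The gains reported in the abstract are expressed as Peak Signal‐to‐Noise Ratio in decibels. As a minimal sketch of how that metric is commonly computed between a ground-truth image and a restored image, assuming 8-bit pixel values (peak 255) and images flattened to equal-length sequences: the function name `psnr` and the sample pixel values below are illustrative assumptions, not taken from the paper, and the paper's exact evaluation protocol (colour space, cropping) is not stated here.

```python
import math

def psnr(reference, restored, peak=255.0):
    """Peak Signal-to-Noise Ratio (dB) between two equal-length pixel sequences."""
    if len(reference) != len(restored) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error over all pixels
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images have unbounded PSNR
    return 10.0 * math.log10(peak ** 2 / mse)

# Hypothetical 4-pixel example (values are made up for illustration)
ref = [52, 55, 61, 59]
out = [54, 55, 60, 58]
print(f"{psnr(ref, out):.2f} dB")
```

Because the scale is logarithmic, for a single image pair a PSNR gain of 0.23 dB corresponds to roughly a 5% lower mean squared error, and 0.19 dB to roughly 4%.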

Funder

National Natural Science Foundation of China

Six Talent Peaks Project in Jiangsu Province

Publisher

Institution of Engineering and Technology (IET)

Subject

Computer Vision and Pattern Recognition, Software

