Efficient image restoration with style-guided context cluster and interaction-Reference-Cited by-同舟云学术

Efficient image restoration with style-guided context cluster and interaction

Published:2024-02-17 Issue:13 Volume:36 Page:6973-6991
ISSN:0941-0643
Container-title:Neural Computing and Applications
language:en
Short-container-title:Neural Comput & Applic

Author:

Qiao Fengjuan,Zhu Yonggui,Meng Ming

Abstract

AbstractRecently, convolutional neural networks (CNNs) and vision transformers (ViTs) have emerged as powerful tools for image restoration (IR). Nonetheless, they encountered some limitations due to their characteristics, such as CNNs sacrificing global reception and ViTs requiring large memory and graphics resources. To address these limitations and explore an alternative approach for improved IR performance, we propose two clustering-based frameworks for general IR tasks, which are style-guided context cluster U-Net (SCoC-UNet) and style-guided clustered point interaction U-Net (SCPI-UNet). The SCoC-UNet adopts a U-shaped architecture, comprising position embedding, Encoder, Decoder, and reconstruction block. Specifically, the input low-quality image is viewed as a set of unorganized points, each of which is first given location information by the continuous relative position embedding method. These points are then fed into a symmetric Encoder and Decoder which utilize style-guided context cluster (SCoC) blocks to extract potential context features and high-frequency information. Although SCoC-UNet has obtained decent performance for image restoration, its SCoC block can only capture connectivity at points within the same cluster, which may ignore long-range dependencies in different clusters. To address this issue, we further propose a SCPI-UNet based on SCoC-UNet, which leverages a style-guided clustered point interaction (SCPI) block in place of the SCoC block. The SCPI block utilizes a cross-attention mechanism to establish the connections of feature points between different clusters. Extensive experimental results demonstrate that the proposed SCoC-UNet and SCPI-UNet can handle several typical IR tasks (i.e., JPEG compression artifact reduction, image denoising, and super-resolution) and achieve superior quantitative and qualitative performance over some state-of-the-art methods.

Funder

State Key Laboratory of Virtual Reality Technology and Systems

Fundamental Research Funds for the Central Universities

Publisher

Springer Science and Business Media LLC

Link

https://link.springer.com/content/pdf/10.1007/s00521-024-09440-4.pdf

Reference75 articles.

1. Zhang K, Zuo W, Gu S, Zhang L (2017) Learning deep cnn denoiser prior for image restoration. In: Proceedings of the conference on computer vision and pattern recognition, pp. 3929–3938

2. Tai Y, Yang J, Liu X (2017) Image super-resolution via deep recursive residual network. In: Proceedings of the conference on computer vision and pattern recognition, pp. 3147–3155

3. Niu B, Wen W, Ren W, Zhang X, Yang L, Wang S, Zhang K, Cao X, Shen H (2020) Single image super-resolution via a holistic attention network. In: Proceedings of the European conference on computer vision, pp. 191–207

4. Liang J, Cao J, Sun G, Zhang K, Van Gool L, Timofte R (2021) Swinir: image restoration using swin transformer. In: Proceedings of the international conference on computer vision, pp. 1833–1844

5. Han K, Wang Y, Chen H, Chen X, Guo J, Liu Z, Tang Y, Xiao A, Xu C, Xu Y (2022) A survey on vision transformer. IEEE Trans Pattern Anal Mach Intell 45(1):87–110