A Transformer-Based Model for Super-Resolution of Anime Image

Published: 2022-10-24 Issue: 21 Volume: 22 Page: 8126
ISSN: 1424-8220
Container-title: Sensors
Language: en
Short-container-title: Sensors

Author:

Xu Shizhuo, Dutta Vibekananda, He Xin, Matsumaru Takafumi

Abstract

Image super-resolution (ISR) technology aims to enhance resolution and improve image quality. It is widely used in real-world image processing applications, especially medical imaging, but has seen relatively little use in anime image production. Furthermore, contemporary ISR tools are often based on convolutional neural networks (CNNs), while few methods attempt to use transformers, which perform well in other advanced vision tasks. In this work, we propose an anime image super-resolution (AISR) method based on the Swin Transformer. The method proceeds in three stages. First, shallow feature extraction produces a feature map of the input image’s low-frequency information, which mainly approximates the distribution of detailed information in the spatial structure (shallow feature). Next, deep feature extraction extracts the image’s semantic information (deep feature). Finally, image reconstruction combines the shallow and deep features to upsample the feature size and performs sub-pixel convolution to obtain many feature map channels. The novelty of the proposal lies in enhancing the low-frequency information with a Gaussian filter and in introducing different window sizes to replace the patch merging operations in the Swin Transformer. We constructed a high-quality anime dataset to improve the robustness of the model in online use, trained our model on this dataset, and tested the model quality. We carried out anime image super-resolution at different magnifications (2×, 4×, 8×). The results were compared numerically and graphically with those of conventional CNN-based and transformer-based methods, using the standard peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) metrics. The experiments and ablation study show that our proposal outperforms the other methods.
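
Two of the standard building blocks named in the abstract can be illustrated in a few lines. The sketch below is not the authors' code. It assumes grayscale images stored as plain lists of rows, and the function names `psnr` and `pixel_shuffle` are hypothetical. `pixel_shuffle` shows only the rearrangement step that follows a sub-pixel convolution (the convolution that produces the r*r channels is omitted), and its channel ordering is an assumption that follows the convention used by common deep learning frameworks.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Both images are lists of rows of pixel intensities (assumed layout).
    """
    ref = [p for row in reference for p in row]
    est = [p for row in estimate for p in row]
    if len(ref) != len(est):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref, est)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def pixel_shuffle(channels, r):
    """Rearrange r*r feature maps of size H x W into one (H*r) x (W*r) map.

    Channel k = dy * r + dx supplies the pixel at offset (dy, dx) inside
    each r x r output block (assumed channel ordering).
    """
    if len(channels) != r * r:
        raise ValueError("expected exactly r*r input channels")
    h, w = len(channels[0]), len(channels[0][0])
    out = [[0] * (w * r) for _ in range(h * r)]
    for k, fmap in enumerate(channels):
        dy, dx = divmod(k, r)
        for y in range(h):
            for x in range(w):
                out[y * r + dy][x * r + dx] = fmap[y][x]
    return out
```

For example, four 1×1 feature maps with r = 2 become a single 2×2 image, which is how a 2× magnification step trades channels for spatial resolution. A higher PSNR means the reconstruction is closer to the reference image.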

Funder

JSPS KAKENHI

Waseda University

Publisher

MDPI AG

Subject

Electrical and Electronic Engineering; Biochemistry; Instrumentation; Atomic and Molecular Physics, and Optics; Analytical Chemistry

Link

https://www.mdpi.com/1424-8220/22/21/8126/pdf

Cited by 4 articles.

1. DAE2GAN: Image super-resolution for remote sensing based on an improved edge-enhanced generative adversarial network with double-end attention mechanism;Journal of Applied Remote Sensing;2024-03-12

2. Automatic Recognition and Quantification Feeding Behaviors of Nursery Pigs Using Improved YOLOV5 and Feeding Functional Area Proposals;Animals;2024-02-08

3. A Zero-Shot Super-Resolution Image Reconstruction Technique Based on Radial Basis Function Neural Networks;Proceedings of the 2023 11th International Conference on Computer and Communications Management;2023-08-04

4. Multi-Label Classification in Anime Illustrations Based on Hierarchical Attribute Relationships;Sensors;2023-05-16