Two‐Step Training: Adjustable Sketch Colourization via Reference Image and Text Tag

Published: 2023-04-05 Issue: 6 Volume: 42
ISSN: 0167-7055
Container-title: Computer Graphics Forum
Language: en
Short-container-title: Computer Graphics Forum

Author:

Yan Dingkun¹, Ito Ryogo¹, Moriai Ryo¹, Saito Suguru¹

Affiliation:

1. Department of Computer Science, Tokyo Institute of Technology, Meguro‐ku, Japan

Abstract

Automatic sketch colourization is a topic of great interest in the image‐generation field. However, due to the absence of texture in sketch images and the lack of training data, existing reference‐based methods are ineffective in generating visually pleasant results and cannot edit the colours using text tags. Thus, this paper presents a conditional generative adversarial network (cGAN)‐based architecture with a pre‐trained convolutional neural network (CNN), reference‐based channel‐wise attention (RBCA) and self‐adaptive multi‐layer perceptron (MLP) to tackle this problem. We propose two‐step training and spatial latent manipulation to achieve high‐quality and colour‐adjustable results using reference images and text tags. The superiority of our approach in reference‐based colourization is demonstrated through qualitative/quantitative comparisons and user studies with existing network‐based methods. We also validate the controllability of the proposed model and discuss the details of our latent manipulation on the basis of experimental results of multi‐label manipulation.

Publisher

Wiley

Subject

Computer Graphics and Computer-Aided Design

Link

https://onlinelibrary.wiley.com/doi/pdf/10.1111/cgf.14791

Cited by 4 articles.

1. ETBHD‐HMF: A Hierarchical Multimodal Fusion Architecture for Enhanced Text‐Based Hair Design;Computer Graphics Forum;2024-09-03

2. Versatile Vision Foundation Model for Image and Video Colorization;Special Interest Group on Computer Graphics and Interactive Techniques Conference Conference Papers '24;2024-07-13

3. DiffMat: Latent diffusion models for image-guided material generation;Visual Informatics;2024-03

4. Anime Sketch Coloring Based on Self-attention Gate and Progressive PatchGAN;Pattern Recognition and Computer Vision;2023-12-28