Artfusion: A Diffusion Model-Based Style Synthesis Framework for Portraits

Authors:

Yang Hyemin 1, Yang Heekyung 2, Min Kyungha 1

Affiliation:

1. Department of Computer Science, Sangmyung University, Seoul 03016, Republic of Korea

2. Department of Software, Sangmyung University, Cheonan 31066, Republic of Korea

Abstract

We present a diffusion model-based approach that applies the artistic style of an individual artist or an art movement to a portrait photograph. Learning a style from such artworks normally requires a training dataset with a large number of samples. We resolve this limitation by combining Contrastive Language-Image Pretraining (CLIP) encoders with a diffusion model, since CLIP encoders extract features from an input portrait very effectively. Our framework includes three independent CLIP encoders that extract text features, color features, and Canny edge features, respectively, from an input portrait. These features are incorporated into the style information that the diffusion model extracts, via an image encoder, from the sample images in the training dataset. The denoising steps of the diffusion model apply this style information to the CLIP-based features of the input portrait. Finally, our framework produces an artistic portrait that preserves both the identity of the input portrait and the artistic style of the training dataset. The most important contribution of our framework is that it requires fewer than a hundred sample images per artistic style; it can therefore extract the style of an artist who has produced fewer than a hundred artworks. We select three artists and three art movements, apply these styles to portraits of various identities, and produce visually pleasing results. We evaluate our results using various metrics, including Fréchet Inception Distance (FID), ArtFID, and the Language-Image Quality Evaluator (LIQE), to demonstrate their quality.
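The conditioning scheme the abstract describes can be illustrated with a minimal sketch (not the authors' code): three feature vectors, standing in for the outputs of the three CLIP encoders (text, color, Canny edge), are concatenated with a style embedding extracted from the training dataset and fed to the denoiser at every step of the reverse diffusion process. All names, dimensions, and the linear noise predictor below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

FEAT_DIM = 8             # per-encoder feature size (assumed)
COND_DIM = 4 * FEAT_DIM  # text + color + edge + style features

def encode_stub(x: np.ndarray) -> np.ndarray:
    """Stand-in for a frozen CLIP encoder: a fixed random projection."""
    proj = rng.standard_normal((FEAT_DIM, x.size))
    return proj @ x.ravel()

def denoise_step(x_t: np.ndarray, cond: np.ndarray,
                 w: np.ndarray, alpha: float) -> np.ndarray:
    """One toy reverse-diffusion step: predict the noise from the current
    sample and the conditioning vector, then remove a fraction of it."""
    eps_hat = w @ np.concatenate([x_t, cond])
    return x_t - alpha * eps_hat

# Toy "portrait" and its three CLIP-style features.
portrait = rng.standard_normal(16)
text_f, color_f, edge_f = (encode_stub(portrait) for _ in range(3))
style_f = rng.standard_normal(FEAT_DIM)  # style embedding from the training set
cond = np.concatenate([text_f, color_f, edge_f, style_f])

# Run a few denoising steps from pure noise, conditioned throughout.
x = rng.standard_normal(16)
w = rng.standard_normal((16, 16 + COND_DIM)) * 0.01  # toy noise predictor
for _ in range(10):
    x = denoise_step(x, cond, w, alpha=0.1)

print(x.shape)  # (16,)
```

In the actual framework, the random projections would be frozen pretrained CLIP encoders and the linear noise predictor a trained U-Net; the sketch only shows how the per-portrait features and the dataset-level style information jointly condition every denoising step.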

Funder

Ministry of Education

Publisher

MDPI AG

