Survey on low-level controllable image synthesis with deep learning

Author:

Zhang Shixiong¹, Li Jiao², Yang Lu¹

Affiliation:

1. School of Automation Engineering, University of Electronic Science and Technology of China, Sichuan, China

2. College of Information Engineering, Sichuan Agricultural University, Sichuan, China

Abstract

Deep learning, particularly generative models, has inspired controllable image synthesis methods and applications. These approaches aim to generate specific visual content using latent prompts. To explore low-level controllable image synthesis for precise rendering and editing tasks, we present a survey of recent works in this field using deep learning. We begin by discussing data sets and evaluation indicators for low-level controllable image synthesis. Then, we review the state-of-the-art research on geometrically controllable image synthesis, focusing on viewpoint/pose and structure/shape controllability. Additionally, we cover photometrically controllable image synthesis methods for 3D re-lighting studies. While our focus is on algorithms, we also provide a brief overview of related applications, products and resources for practitioners.

Publisher

American Institute of Mathematical Sciences (AIMS)

Subject

General Mathematics

References: 217 articles.

