Image embedding for denoising generative models

Published:2023-06-02 Issue:12 Volume:56 Page:14511-14533
ISSN:0269-2821
Container-title:Artificial Intelligence Review
language:en
Short-container-title:Artif Intell Rev

Author:

Asperti Andrea,Evangelista Davide,Marro Samuele,Merizzi Fabio

Abstract

Denoising Diffusion Models are gaining increasing popularity in the field of generative modeling for several reasons, including their simple and stable training, excellent generative quality, and solid probabilistic foundation. In this article, we address the problem of embedding an image into the latent space of Denoising Diffusion Models, that is, finding a suitable “noisy” image whose denoising results in the original image. We particularly focus on Denoising Diffusion Implicit Models due to the deterministic nature of their reverse diffusion process. As a side result of our investigation, we gain a deeper insight into the structure of the latent space of diffusion models, opening interesting perspectives on its exploration, the definition of semantic trajectories, and the manipulation/conditioning of encodings for editing purposes. A particularly interesting property highlighted by our research, which is also characteristic of this class of generative models, is the independence of the latent representation from the networks implementing the reverse diffusion process. In other words, a common seed passed to different networks (each trained on the same dataset) eventually results in identical images.

Funder

Alma Mater Studiorum - Università di Bologna

Publisher

Springer Science and Business Media LLC

Subject

Artificial Intelligence,Linguistics and Language,Language and Linguistics

Link

https://link.springer.com/content/pdf/10.1007/s10462-023-10504-5.pdf

Reference: 41 articles.

1. Abdal R, Qin Y, Wonka P (2019) Image2stylegan: How to embed images into the stylegan latent space? In: 2019 IEEE/CVF international conference on computer vision, ICCV 2019, Seoul, Korea (South), October 27 - November 2, 2019, IEEE, pp 4431–4440. https://doi.org/10.1109/ICCV.2019.00453

2. Abdal R, Qin Y, Wonka P (2020) Image2stylegan++: How to edit the embedded images? In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 8296–8305

3. Alaluf Y, Tov O, Mokady R, Gal R, Bermano A (2022) Hyperstyle: Stylegan inversion with hypernetworks for real image editing. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 18511–18521