Abstract
AbstractSemantic image editing allows users to selectively change entire image attributes in a controlled manner with just a few clicks. Most approaches use a generative adversarial network (GAN) for this task to learn an appropriate latent space representation and attribute-specific transformations. Attribute entanglement has been a limiting factor for previous approaches to attribute manipulation. However, more recent approaches have made significant improvements in this regard using separate networks for attribute extraction. Iterative optimization algorithms based on backpropagation can be used to find attribute vectors with minimal entanglement, but this requires large amounts of GPU memory, can lead to training instability, and requires differentiable models. To circumvent these issues, we present a local search-based approach to latent space editing that achieves comparable performance to existing algorithms while avoiding the aforementioned drawbacks. We also introduce a new evaluation metric that is easier to interpret than previous metrics.
Funder
Universität der Bundeswehr München
Publisher
Springer Science and Business Media LLC
Subject
Computer Science Applications,Computer Networks and Communications,Computer Graphics and Computer-Aided Design,Computational Theory and Mathematics,Artificial Intelligence,General Computer Science
Reference57 articles.
1. Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y. Generative adversarial nets. In: Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N., Weinberger, K.Q. (eds.) Advances in Neural Information Processing Systems, vol. 27. Curran Associates, Inc., Morehouse Lane Red Hook NY, United States 2014. https://proceedings.neurips.cc/paper/2014/file/5ca3e9b122f61f8f06494c97b1afccf3-Paper.pdf. Accessed 13 Oct 2023
2. Choi Y, Choi M, Kim M, Ha J-W, Kim S, Choo J. Stargan: Unified generative adversarial networks for multi-domain image-to-image translation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2018.
3. Choi Y, Uh Y, Yoo J, Ha J-W. Stargan v2: Diverse image synthesis for multiple domains. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020.
4. Isola P, Zhu J-Y, Zhou T, Efros AA. Image-to-image translation with conditional adversarial networks. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017;5967–5976. https://doi.org/10.1109/CVPR.2017.632
5. Lee H-Y, Tseng H-Y, Mao Q, Huang J-B, Lu Y-D, Singh MK, Yang M-H. Drit++: diverse image-to-image translation viadisentangled representations. Int J Comput Vis. 2020;128:2402–17.