TextControlGAN: Text-to-Image Synthesis with Controllable Generative Adversarial Networks

Published: 2023-04-19 Issue: 8 Volume: 13 Page: 5098
ISSN: 2076-3417
Container-title: Applied Sciences
Language: en
Short-container-title: Applied Sciences

Author:

Ku Hyeeun¹, Lee Minhyeok¹

Affiliation:

1. School of Electrical and Electronics Engineering, Chung-Ang University, Seoul 06974, Republic of Korea

Abstract

Generative adversarial networks (GANs) have demonstrated remarkable potential in the realm of text-to-image synthesis. Nevertheless, conventional GANs employing conditional latent space interpolation and manifold interpolation (GAN-INT-CLS) encounter challenges in generating images that accurately reflect the given text descriptions. To overcome these limitations, we introduce TextControlGAN, a controllable GAN-based model specifically designed for text-to-image synthesis tasks. In contrast to traditional GANs, TextControlGAN incorporates a neural network structure, known as a regressor, to effectively learn features from conditional texts. To further enhance the learning performance of the regressor, data augmentation techniques are employed. As a result, the generator within TextControlGAN can learn conditional texts more effectively, leading to the production of images that more closely adhere to the textual conditions. Furthermore, by concentrating the discriminator’s training efforts on GAN training exclusively, the overall quality of the generated images is significantly improved. Evaluations conducted on the Caltech-UCSD Birds-200 (CUB) dataset demonstrate that TextControlGAN surpasses the performance of the cGAN-based GAN-INT-CLS model, achieving a 17.6% improvement in Inception Score (IS) and a 36.6% reduction in Fréchet Inception Distance (FID). In supplementary experiments utilizing 128 × 128 resolution images, TextControlGAN exhibits a remarkable ability to manipulate minor features of the generated bird images according to the given text descriptions. These findings highlight the potential of TextControlGAN as a powerful tool for generating high-quality, text-conditioned images, paving the way for future advancements in the field of text-to-image synthesis.
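
The abstract describes a three-network split (generator, discriminator, regressor) but does not state the loss functions. The following is a minimal sketch of how such a split could be expressed, under stated assumptions: a binary cross-entropy adversarial term, a mean-squared-error regression term against the text embedding, and a hypothetical weighting factor `gamma`. The function names and the exact loss forms are illustrative and are not taken from the paper.

```python
import math


def bce(prob, target):
    """Binary cross-entropy for one probability and a 0/1 target."""
    eps = 1e-12
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))


def mse(pred, target):
    """Mean squared error between two equal-length vectors."""
    if len(pred) != len(target):
        raise ValueError("vectors must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def discriminator_loss(d_real, d_fake):
    # The discriminator only separates real from generated images;
    # it has no text-matching term (assumed reading of the abstract).
    return bce(d_real, 1.0) + bce(d_fake, 0.0)


def regressor_loss(r_real, text_embedding):
    # The regressor is trained to recover the text embedding from a
    # real (possibly augmented) image. MSE is an assumption.
    return mse(r_real, text_embedding)


def generator_loss(d_fake, r_fake, text_embedding, gamma=1.0):
    # Adversarial term plus a gamma-weighted term that rewards images
    # whose regressed embedding matches the conditioning text.
    # `gamma` is a hypothetical hyperparameter.
    return bce(d_fake, 1.0) + gamma * mse(r_fake, text_embedding)
```

In this sketch the text condition reaches the generator only through the regressor term, which is one way to read the abstract's claim that the discriminator concentrates on GAN training exclusively.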

Funder

National Research Foundation of Korea

Publisher

MDPI AG

Subject

Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science

Link

https://www.mdpi.com/2076-3417/13/8/5098/pdf

References: 53 articles.

1. Samek, W., Wiegand, T., and Müller, K.-R. (2017). Explainable artificial intelligence: Understanding, visualizing and interpreting deep learning models. arXiv.

2. Lee, Y.-L., Tsung, P.-K., and Wu, M. (2018, January 16–19). Techology trend of edge ai. Proceedings of the 2018 International Symposium on VLSI Design, Automation and Test (VLSI-DAT), Hsinchu, Taiwan.

3. Ongsulee, P. (2017, January 22–24). Artificial intelligence, machine learning and deep learning. Proceedings of the 2017 15th International Conference on ICT and Knowledge Engineering (ICT&KE), Bangkok, Thailand.

4. Makhzani, A., Shlens, J., Jaitly, N., Goodfellow, I., and Frey, B. (2015). Adversarial autoencoders. arXiv.

5. Mescheder, L., Nowozin, S., and Geiger, A. (2017, January 6–11). Adversarial variational bayes: Unifying variational autoencoders and generative adversarial networks. Proceedings of the International Conference on Machine Learning, Sydney, Australia.

Cited by 20 articles.

1. Advancements in adversarial generative text-to-image models: a review;The Imaging Science Journal;2024-09-04

2. An improved StyleGAN-based TextToFace model with Local-Global information Fusion;Expert Systems with Applications;2024-09

3. Generative artificial intelligence: a systematic review and applications;Multimedia Tools and Applications;2024-08-14

4. A Technological Framework to Support Asthma Patient Adherence Using Pictograms;Applied Sciences;2024-07-23

5. Enhancing Control in Stable Diffusion Through Example-based Fine-Tuning and Prompt Engineering;2024 5th International Conference on Image Processing and Capsule Networks (ICIPCN);2024-07-03