1. Ramesh, A., et al.: Zero-Shot Text-to-Image Generation. In: International Conference on Machine Learning (ICML), pp. 8821–8831 (2021)
2. Ramesh, A., Dhariwal, P., Nichol, A., Chu, C., Chen, M.: Hierarchical Text-Conditional Image Generation with CLIP Latents. arXiv preprint arXiv:2204.06125 (2022)
3. Saharia, C., et al.: Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding. arXiv preprint arXiv:2205.11487 (2022)
4. Razavi, A., van den Oord, A., Vinyals, O.: Generating diverse high-fidelity images with VQ-VAE-2. Adv. Neural Inf. Process. Syst. (NeurIPS) 32, 14837–14847 (2019)
5. Biswas, S.: In: Lecture Notes in Computer Science (2021)