1. Radford, A.; Kim, J.; Hallacy, C.; Ramesh, A.; Goh, G.; Agarwal, S.; Sastry, G.; Askell, A.; Mishkin, P.; Clark, J.; et al. Learning transferable visual models from natural language supervision. In: Proceedings of the 38th International Conference on Machine Learning, 8748–8763, 2021.
2. Patashnik, O.; Wu, Z.; Shechtman, E.; Cohen-Or, D.; Lischinski, D. StyleCLIP: Text-driven manipulation of StyleGAN imagery. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, 2065–2074, 2021.
3. Gal, R.; Patashnik, O.; Maron, H.; Bermano, A. H.; Chechik, G.; Cohen-Or, D. StyleGAN-NADA: CLIP-guided domain adaptation of image generators. ACM Transactions on Graphics Vol. 41, No. 4, Article No. 141, 2022.
4. Frans, K.; Soros, L.; Witkowski, O. CLIPDraw: Exploring text-to-drawing synthesis through language-image encoders. In: Proceedings of the 36th Conference on Neural Information Processing Systems, 5207–5218, 2022.
5. Wang, C.; Chai, M.; He, M.; Chen, D.; Liao, J. CLIP-NeRF: Text-and-image driven manipulation of neural radiance fields. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 3835–3844, 2022.