Enhancing Baidu Multimodal Advertisement with Chinese Text-to-Image Generation via Bilingual Alignment and Caption Synthesis

Published: 2024-07-10 Page: 2855-2859
Container-title: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval

Author:

Zhao Kang¹, Zhao Xinyu¹, Jin Zhipeng¹, Yang Yi¹, Tao Wen¹, Han Cong¹, Li Shuanglong¹, Liu Lin¹

Affiliation:

1. Baidu Search Ads, Baidu Inc., Beijing, China

Publisher

ACM

Link

https://dl.acm.org/doi/pdf/10.1145/3626772.3661350

References: 32 articles.

1. Peter Anderson, Basura Fernando, Mark Johnson, and Stephen Gould. 2016. SPICE: Semantic Propositional Image Caption Evaluation. ArXiv abs/1607.08822 (2016).

2. Martín Arjovsky, Soumith Chintala, and Léon Bottou. 2017. Wasserstein Generative Adversarial Networks. In International Conference on Machine Learning.

3. Jinze Bai, Shuai Bai, Shusheng Yang, Shijie Wang, Sinan Tan, Peng Wang, Junyang Lin, Chang Zhou, and Jingren Zhou. 2023. Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities. ArXiv abs/2308.12966 (2023).

4. Fan Bao, Shen Nie, Kaiwen Xue, Yue Cao, Chongxuan Li, Hang Su, and Jun Zhu. 2022. All are Worth Words: A ViT Backbone for Diffusion Models. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2022), 22669-22679.

5. James Betker, Gabriel Goh, Li Jing, Tim Brooks, Jianfeng Wang, Linjie Li, Long Ouyang, Juntang Zhuang, Joyce Lee, Yufei Guo, Wesam Manassra, Prafulla Dhariwal, Casey Chu, Yunxin Jiao, and Aditya Ramesh. [n. d.]. Improving Image Generation with Better Captions.