Story-to-Images Translation: Leveraging Diffusion Models and Large Language Models for Sequence Image Generation

Published: 2023-10-29
Container-title: Proceedings of the 2nd Workshop on User-centric Narrative Summarization of Long Videos

Author:

Kumagai Haruka¹, Yamaki Ryosuke², Naganuma Hiroki³

Affiliation:

1. The University of Tokyo, Tokyo, Japan

2. Ritsumeikan University & ProPlace Inc., Kyoto, Japan

3. Université de Montréal & ProPlace Inc., Montréal, PQ, Canada

Publisher

ACM

Link

https://dl.acm.org/doi/pdf/10.1145/3607540.3617144

References (28 articles)

1. Ekin Akyürek, Dale Schuurmans, Jacob Andreas, Tengyu Ma, and Denny Zhou. 2022. What learning algorithm is in-context learning? investigations with linear models. arXiv preprint arXiv:2211.15661 (2022).

2. Wenhu Chen, Hexiang Hu, Chitwan Saharia, and William W Cohen. 2022. Re-imagen: Retrieval-augmented text-to-image generator. arXiv preprint arXiv:2209.14491 (2022).

3. Colin Conwell and Tomer Ullman. 2022. Testing relational understanding in text-guided image generation. arXiv preprint arXiv:2208.00005 (2022).

4. Rinon Gal, Yuval Alaluf, Yuval Atzmon, Or Patashnik, Amit H Bermano, Gal Chechik, and Daniel Cohen-Or. 2022. An image is worth one word: Personalizing text-to-image generation using textual inversion. arXiv preprint arXiv:2208.01618 (2022).

5. Generative adversarial networks