StoryDALL-E: Adapting Pretrained Text-to-Image Transformers for Story Continuation

Published: 2022
Pages: 70-87
ISSN: 0302-9743
Container title: Lecture Notes in Computer Science

Author:

Adyasha Maharana, Darryl Hannan, Mohit Bansal

Publisher

Springer Nature Switzerland

Link

https://link.springer.com/content/pdf/10.1007/978-3-031-19836-6_5

References: 48 articles.

1. Borgeaud, S., et al.: Improving language models by retrieving from trillions of tokens. arXiv preprint arXiv:2112.04426 (2021)

2. Changpinyo, S., Sharma, P., Ding, N., Soricut, R.: Conceptual 12M: pushing web-scale image-text pre-training to recognize long-tail visual concepts. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3558–3568 (2021)

3. Cho, J., Zala, A., Bansal, M.: DALL-Eval: probing the reasoning skills and social biases of text-to-image generative transformers. arXiv preprint arXiv:2202.04053 (2022)

4. Esser, P., Rombach, R., Ommer, B.: Taming transformers for high-resolution image synthesis. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12873–12883 (2021)

5. Frans, K., Soros, L., Witkowski, O.: CLIPDraw: exploring text-to-drawing synthesis through language-image encoders. arXiv preprint arXiv:2106.14843 (2021)

Cited by 13 articles.

1. LeMon: Automating Portrait Generation for Zero-Shot Story Visualization with Multi-Character Interactions; Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining; 2024-08-24

2. The Chosen One: Consistent Characters in Text-to-Image Diffusion Models; Special Interest Group on Computer Graphics and Interactive Techniques Conference Conference Papers '24; 2024-07-13

3. Look and Review, Then Tell: Generate More Coherent Paragraphs from Images by Fusing Visual and Textual Information; 2024 International Joint Conference on Neural Networks (IJCNN); 2024-06-30

4. Open Sesame? Open Salami! Personalizing Vocabulary Assessment-Intervention for Children via Pervasive Profiling and Bespoke Storybook Generation; Proceedings of the CHI Conference on Human Factors in Computing Systems; 2024-05-11

5. Causal-Story: Local Causal Attention Utilizing Parameter-Efficient Tuning for Visual Story Synthesis; ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); 2024-04-14