Authors
Park Pilseo, Jang Soojin, Cho Yunsung, Kim Youngbin
Publisher
Springer Science and Business Media LLC
Subject
Computer Networks and Communications, Hardware and Architecture, Media Technology, Software
References: 58 articles.
1. Ba J, Kiros JR, Hinton GE (2016) Layer normalization. arXiv:1607.06450
2. Bi B, Li C, Wu C, et al (2020) PALM: pre-training an autoencoding&autoregressive language model for context-conditioned generation. In: Conference on empirical methods in natural language processing
3. Chen S, Jin Q, Wang P et al (2020) Say as you wish: fine-grained control of image caption generation with abstract scene graphs. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020:9959–9968
4. Chen T, Tian R, Ding Z (2021) Visual reasoning using graph convolutional networks for predicting pedestrian crossing intention. IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) 2021:3096–3102
5. Chen YC, Li L, Yu L, et al (2019) Uniter: learning universal image-text representations. arXiv:1909.11740
Cited by
1 article.
1. Fast retrieval of multi-modal embeddings for e-commerce applications;International Journal of Knowledge-based and Intelligent Engineering Systems;2024-05-16