Unified Multimodal Model with Unlikelihood Training for Visual Dialog

Published: 2022-10-10
Container-title: Proceedings of the 30th ACM International Conference on Multimedia

Author:

Wang Zihao¹, Wang Junli¹, Jiang Changjun¹

Affiliation:

1. Tongji University, Shanghai, China

Funder

The National Key Research and Development Program of China

Publisher

ACM

Link

https://dl.acm.org/doi/pdf/10.1145/3503161.3547974

References: 43 articles.

1. History for Visual Dialog: Do we really need it?

2. Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering

3. VQA: Visual Question Answering

4. Multimodal Machine Learning: A Survey and Taxonomy

5. Hangbo Bao, Li Dong, Furu Wei, Wenhui Wang, Nan Yang, Xiaodong Liu, Yu Wang, Jianfeng Gao, Songhao Piao, Ming Zhou, and Hsiao-Wuen Hon. 2020. UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training. In Proceedings of the 37th International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 119), Hal Daumé III and Aarti Singh (Eds.). PMLR, 642-652. https://proceedings.mlr.press/v119/bao20a.html

Cited by 4 articles.

1. SDGIN: Structure-aware dual-level graph interactive network with semantic roles for visual dialog;Knowledge-Based Systems;2024-02

2. VD-GR: Boosting Visual Dialog with Cascaded Spatial-Temporal Multi-Modal GRaphs;2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV);2024-01-03

3. Modeling Intra- and Inter-Modal Alignment with Optimal Transport for Visual Dialog;2023 IEEE 35th International Conference on Tools with Artificial Intelligence (ICTAI);2023-11-06

4. Learning from Easy to Hard Pairs: Multi-step Reasoning Network for Human-Object Interaction Detection;Proceedings of the 31st ACM International Conference on Multimedia;2023-10-26