Multi-scale relation reasoning for multi-modal Visual Question Answering

Published:2021-08 Volume:96 Page:116319
ISSN:0923-5965
Container-title:Signal Processing: Image Communication
language:en
Short-container-title:Signal Processing: Image Communication

Author:

Wu Yirui,Ma Yuntao,Wan Shaohua

Funder

National Key Research and Development Program of China Stem Cell and Translational Research

Publisher

Elsevier BV

Subject

Electrical and Electronic Engineering,Computer Vision and Pattern Recognition,Signal Processing,Software

References: 42 articles.

1. R. Hu, A. Rohrbach, T. Darrell, K. Saenko, Language-conditioned graph networks for relational reasoning, in: Proceedings of International Conference on Computer Vision, 2019, pp. 10293–10302.

2. Visual question answering model based on visual relationship detection;Xi;Signal Process. Image Commun., 2020

3. L. Xu, H. Huang, J. Liu, SUTD-TrafficQA: A question answering benchmark and an efficient network for video reasoning over traffic events, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021.

4. Exploring deep learning for view-based 3D model retrieval;Gao;ACM Trans. Multim. Comput. Commun. Appl., 2020

5. Y. Goyal, T. Khot, D. Summers-Stay, D. Batra, D. Parikh, Making the V in VQA matter: Elevating the role of image understanding in visual question answering, in: Proceedings of Computer Vision and Pattern Recognition, 2017, pp. 6325–6334.

Cited by 39 articles.

1. Advancing Vietnamese Visual Question Answering with Transformer and Convolutional Integration;Computers and Electrical Engineering;2024-10

2. Situational Data Integration in Question Answering systems: a survey over two decades;Knowledge and Information Systems;2024-06-18

3. Integrating multimodal features by a two-way co-attention mechanism for visual question answering;Multimedia Tools and Applications;2023-12-29

4. Localize the Copy-Move Forged Region of an Image Using Improved SIFT;SN Computer Science;2023-12-07

5. Counting in Visual Question Answering: Methods, Datasets, and Future Work;International Journal of Image and Graphics;2023-10-20