Semantic representation for visual reasoning

Published: 2019 Volume: 277 Page: 02006
ISSN: 2261-236X
Container-title: MATEC Web of Conferences
Short-container-title: MATEC Web Conf.

Author:

Ni Xubin,Yin Lirong,Chen Xiaobing,Liu Shan,Yang Bo,Zheng Wenfeng

Abstract

In visual reasoning, image features are widely used as the input to neural networks that produce answers. However, image features are too redundant for regular networks to learn accurate characterizations from them. Human reasoning, by contrast, usually constructs an abstract description that leaves out irrelevant details. Inspired by this, this paper introduces a higher-level representation, named semantic representation, to make visual reasoning more efficient. The idea of the Gram matrix used in neural style transfer research is adapted here to build a relation matrix, which allows the relational information between objects to be better represented. A model using semantic representation as input outperforms the same model using image features as input, indicating that introducing a high-level semantic representation yields more accurate results in visual reasoning.
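The abstract does not spell out how the relation matrix is built, so the following is only a minimal sketch of the generic Gram matrix idea from neural style transfer, applied to per-object vectors. It assumes each object is described by a fixed-length numeric vector; the function name `gram_matrix` and the sample values are hypothetical and are not taken from the paper.

```python
from typing import List


def gram_matrix(features: List[List[float]]) -> List[List[float]]:
    """Return G where G[i][j] is the dot product of feature rows i and j."""
    if not features:
        return []
    dim = len(features[0])
    if any(len(row) != dim for row in features):
        raise ValueError("all feature vectors must have the same length")
    return [
        [sum(a * b for a, b in zip(fi, fj)) for fj in features]
        for fi in features
    ]


# Hypothetical semantic vectors for three objects (values invented for illustration).
objects = [
    [1.0, 0.0, 2.0],
    [0.0, 1.0, 1.0],
    [1.0, 1.0, 0.0],
]

# Entry (i, j) is large when objects i and j have similar vectors.
relation = gram_matrix(objects)
for row in relation:
    print(row)
```

The resulting matrix is symmetric, with one row and one column per object, so its size depends on the number of objects rather than on the image resolution.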

Publisher

EDP Sciences

Subject

General Medicine

Link

https://www.matec-conferences.org/10.1051/matecconf/201927702006/pdf

References: 15 articles.

1. Kafle K. and Kanan C. Answer-Type Prediction for Visual Question Answering. In Computer Vision and Pattern Recognition (2016).

2. Malinowski M. and Fritz M. A multi-world approach to question answering about real-world scenes based on uncertain input. In Advances in Neural Information Processing Systems (2014).

3. Johnson J., et al. CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning (2017).

4. Zhou B., et al. Simple Baseline for Visual Question Answering. Computer Science (2015).

5. Andreas J., et al. Neural Module Networks. In IEEE Conference on Computer Vision and Pattern Recognition (2016).

Cited by 28 articles.

1. Multiscale Feature Extraction and Fusion of Image and Text in VQA;International Journal of Computational Intelligence Systems;2023-04-11

2. Iterative reconstruction of low-dose CT based on differential sparse;Biomedical Signal Processing and Control;2023-01

3. Increasing Text Filtering Accuracy with Improved LSTM;Computing and Informatics;2023

4. User OCEAN Personality Model Construction Method Using a BP Neural Network;Electronics;2022-09-23

5. Improved Image Fusion Method Based on Sparse Decomposition;Electronics;2022-07-26