Exploring Human-Like Attention Supervision in Visual Question Answering

Published: 2018-04-27 Issue: 1 Volume: 32
ISSN: 2374-3468
Container-title: Proceedings of the AAAI Conference on Artificial Intelligence
Short-container-title: AAAI

Author:

Qiao Tingting, Dong Jianfeng, Xu Duanqing

Abstract

Attention mechanisms have been widely applied to the Visual Question Answering (VQA) task, as they help to focus on the areas of interest in both visual and textual information. To answer questions correctly, a model needs to selectively target different areas of an image, which suggests that an attention-based model may benefit from explicit attention supervision. In this work, we address the problem of adding attention supervision to VQA models. Since human attention data is scarce, we first propose a Human Attention Network (HAN) to generate human-like attention maps, trained on a recently released dataset called the Human ATtention Dataset (VQA-HAT). We then apply the pre-trained HAN to the VQA v2.0 dataset to automatically produce human-like attention maps for all image-question pairs. The resulting dataset of generated maps for VQA v2.0 is named the Human-Like ATtention (HLAT) dataset. Finally, we apply human-like attention supervision to an attention-based VQA model. The experiments show that adding human-like supervision yields more accurate attention together with better performance, showing a promising future for human-like attention supervision in VQA.
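
The abstract does not say how the supervision signal enters training, so the following is only a minimal sketch of one common way to supervise attention, not the authors' formulation. It assumes the model produces unnormalized attention scores over image regions, that the human-like map (such as one from HLAT) gives a non-negative weight per region, and that the two are compared with a KL divergence added to the answer loss. The function names and the `weight` parameter are hypothetical.

```python
import math


def softmax(scores):
    """Turn unnormalized scores into a probability distribution."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def normalize(weights):
    """Scale non-negative weights so they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]


def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted) for two distributions of equal length."""
    return sum(
        t * math.log((t + eps) / (p + eps))
        for t, p in zip(target, predicted)
    )


def supervised_vqa_loss(answer_logits, answer_index,
                        attention_scores, human_like_map, weight=1.0):
    """Answer cross-entropy plus a weighted attention-supervision term.

    Assumption: the attention term is a KL divergence between the
    human-like map and the model's attention. The source abstract does
    not specify the loss, so this choice is illustrative only.
    """
    answer_probs = softmax(answer_logits)
    answer_loss = -math.log(answer_probs[answer_index])
    attention_loss = kl_divergence(
        normalize(human_like_map), softmax(attention_scores)
    )
    return answer_loss + weight * attention_loss


# Same answer prediction, two different attention patterns over 4 regions.
human_map = [1.0, 0.0, 0.0, 0.0]
aligned = supervised_vqa_loss([2.0, 0.5], 0, [3.0, 0.0, 0.0, 0.0], human_map)
misaligned = supervised_vqa_loss([2.0, 0.5], 0, [0.0, 0.0, 0.0, 3.0], human_map)
print(aligned < misaligned)
```

In this sketch, attention that concentrates on the region the human-like map highlights is penalized less than attention that concentrates elsewhere, while the answer term is unchanged. Setting `weight` to zero recovers an unsupervised attention model.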

Publisher

Association for the Advancement of Artificial Intelligence (AAAI)

Subject

General Medicine

Cited by 33 articles.

1. Overcoming language priors in visual question answering with cumulative learning strategy; Neurocomputing; 2024-12

2. Gradient-Based Instance-Specific Visual Explanations for Object Specification and Object Discrimination; IEEE Transactions on Pattern Analysis and Machine Intelligence; 2024-09

3. Global explanation supervision for Graph Neural Networks; Frontiers in Big Data; 2024-07-01

4. A review on devices and learning techniques in domestic intelligent environment; Journal of Ambient Intelligence and Humanized Computing; 2024-03-13

5. IMCN: Improved modular co-attention networks for visual question answering; Applied Intelligence; 2024-03