Regularizing Attention Networks for Anomaly Detection in Visual Question Answering

Published:2021-05-18 Issue:3 Volume:35 Page:1845-1853
ISSN:2374-3468
Container-title:Proceedings of the AAAI Conference on Artificial Intelligence
Short-container-title:AAAI

Author:

Lee Doyup, Cheon Yeongjae, Han Wook-Shin

Abstract

For stability and reliability of real-world applications, the robustness of deep neural networks (DNNs) in unimodal tasks has been evaluated. However, few studies consider abnormal situations that a visual question answering (VQA) model might encounter at test time after deployment in the real world. In this study, we evaluate the robustness of state-of-the-art VQA models to five different anomalies, including worst-case scenarios, the most frequent scenarios, and the current limitation of VQA models. Unlike the results in unimodal tasks, the maximum confidence of answers in VQA models cannot detect anomalous inputs, and post-training of the outputs, such as outlier exposure, is ineffective for VQA models. Thus, we propose an attention-based method, which uses the confidence of reasoning between input images and questions and shows much more promising results than the previous methods in unimodal tasks. In addition, we show that a maximum entropy regularization of attention networks can significantly improve the attention-based anomaly detection of the VQA models. Thanks to their simplicity, the attention-based anomaly detection and the regularization are model-agnostic methods, which can be used for various cross-modal attentions in the state-of-the-art VQA models. The results imply that cross-modal attention in VQA is important for improving not only VQA accuracy, but also the robustness to various anomalies.
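
The abstract names three quantities without giving formulas: the maximum confidence of answers, a confidence derived from cross-modal attention, and a maximum entropy regularizer on the attention. The sketch below is one plausible reading, not the authors' formulation. It assumes the answer confidence is the maximum softmax probability, the attention-based score is the largest attention weight, and the regularizer subtracts a weighted attention entropy from the task loss. All function names and the `weight` parameter are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def max_confidence(answer_logits):
    """Baseline score: maximum softmax probability over candidate answers."""
    return max(softmax(answer_logits))


def attention_confidence(attention_logits):
    """Assumed attention-based score: the largest cross-modal attention weight."""
    return max(softmax(attention_logits))


def entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def regularized_loss(task_loss, attention_logits, weight=0.1):
    """Assumed form of the regularizer: subtract a weighted attention entropy,
    so that minimizing the loss pushes the attention entropy up."""
    return task_loss - weight * entropy(softmax(attention_logits))
```

Under these assumptions, an input would be flagged as anomalous when its score falls below a threshold chosen on held-out data. The choice of threshold and of `weight` is left open here because the abstract does not state them.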

Publisher

Association for the Advancement of Artificial Intelligence (AAAI)

Subject

General Medicine

Cited by 7 articles.

1. Human–Object Interaction detection via Global Context and Pairwise-level Fusion Features Integration;Neural Networks;2024-02

2. PromptAD: Zero-shot Anomaly Detection using Text Prompts;2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV);2024-01-03

3. Benchmarking Out-of-Distribution Detection in Visual Question Answering;2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV);2024-01-03

4. Rare Category Analysis for Complex Data: A Review;ACM Computing Surveys;2023-11-27

5. Toward Unsupervised Realistic Visual Question Answering;2023 IEEE/CVF International Conference on Computer Vision (ICCV);2023-10-01