Authors:
Dong Feng, Wang Xiaofeng, Oad Ammar, Talpur Mir Sajjad Hussain, Khuhro Mansoor Ahmed, Sun Bowen
Subjects:
Electrical and Electronic Engineering, General Computer Science, Control and Systems Engineering
References: 44 articles.
1. Zhang (2020) Multimodal feature fusion by relational reasoning and attention for visual question answering. Inf Fus.
2. Chen, S., Zhao, Y., Jin, Q., & Wu, Q. (2020) Fine-grained video-text retrieval with hierarchical graph reasoning. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 10638-10647.
3. Bahdanau, D., Cho, K., & Bengio, Y. (2014) Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473.
4. Peng (2020) MRA-Net: Improving VQA via Multi-modal Relation Attention Network. IEEE Trans Pattern Anal Mach Intell.
5. Liu (2020) Image caption generation with dual attention mechanism. Inf Process Manag.