1. Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering
2. Liang Peng, Yang Yang, Zheng Wang, Zi Huang, and Heng Tao Shen. MRA-Net: Improving VQA via multi-modal relation attention network. In TPAMI, 2020.
3. Peter Anderson, Qi Wu, Damien Teney, Jake Bruce, Mark Johnson, Niko Sünderhauf, Ian Reid, Stephen Gould, and Anton Van Den Hengel. Vision-and-language navigation: Interpreting visually-grounded navigation instructions in real environments. In CVPR, 2018.
4. Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation
5. Comprehensive Feature-Based Robust Video Fingerprinting Using Tensor Model