1. Binaryvqa: A versatile test set to evaluate the out-of-distribution generalization of vqa models;arXiv preprint arXiv:2301.12032,2023
2. Webqa: Multihop and multimodal qa;Chang,2022
3. Clevr: A diagnostic dataset for compositional language and elementary visual reasoning;Johnson,2017
4. Mdetr - modulated detection for end-to-end multi-modal understanding;Kamath,2021
5. Selective question answering under domain shift;Kamath;arXiv preprint arXiv:2006.09462,2020