1. Al Rahhal et al., 2022. Open-ended remote sensing visual question answering with transformers. Int. J. Remote Sens.
2. Alayrac et al., 2022. Flamingo: A visual language model for few-shot learning. Adv. Neural Inf. Process. Syst.
3. Anderson, P., He, X., Buehler, C., Teney, D., Johnson, M., Gould, S., Zhang, L., 2018. Bottom-up and top-down attention for image captioning and visual question answering. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 6077–6086.
4. Antol, S., Agrawal, A., Lu, J., Mitchell, M., Batra, D., Zitnick, C.L., Parikh, D., 2015. VQA: Visual question answering. In: Proceedings of the IEEE International Conference on Computer Vision. pp. 2425–2433.
5. Bashmal et al., 2023. Visual question generation from remote sensing images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens.