1. Flamingo: A visual language model for few-shot learning;Alayrac,2022
2. PaLM 2 Technical Report;Anil,2023
3. VQA: Visual question answering;Antol,2015
4. Foundational models defining a new era in vision: A survey and outlook;Awais,2023
5. Language models for human-robot interaction;Billing,2023