1. Jean-Baptiste Alayrac et al. 2022. Flamingo: a visual language model for few-shot learning. Advances in Neural Information Processing Systems (2022).
2. Satanjeev Banerjee and Alon Lavie. 2005. METEOR: An automatic metric for MT evaluation with improved correlation with human judgments. In Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization. 65--72.
3. Yunkai Chen, Qimeng Wang, Shiwei Wu, Yan Gao, Tong Xu, and Yao Hu. 2024. TOMGPT: Reliable Text-Only Training Approach for Cost-Effective Multi-modal Large Language Model. ACM Transactions on Knowledge Discovery from Data (2024).
4. EmoSen: Generating Sentiment and Emotion Controlled Responses in a Multimodal Dialogue System