1. BLIP-2: Bootstrapping language-image pre-training with frozen image encoders and large language models;Li,2023
2. MiniGPT-4: Enhancing vision-language understanding with advanced large language models;Zhu,2023
3. Visual Instruction
4. Scaling instruction-finetuned language models;Chung,2022
5. Judging LLM-as-a-judge with MT-Bench and Chatbot Arena;Zheng,2023