1. On the opportunities and risks of foundation models;Bommasani,2021
2. Language models are few-shot learners;Brown;Adv. Neural Inf. Process. Syst.,2020
3. GPT-4 technical report;Achiam,2023
4. Learning transferable visual models from natural language supervision;Radford,2021
5. A. Kirillov, E. Mintun, N. Ravi, H. Mao, C. Rolland, L. Gustafson, T. Xiao, S. Whitehead, A.C. Berg, W.-Y. Lo, et al., Segment anything, in: Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 4015–4026.