1. J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, in: Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Vol. 1 (Long and Short Papers), Association for Computational Linguistics, Minneapolis, Minnesota, 2019, pp. 4171–4186.
2. T. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, et al., Language models are few-shot learners, in: Advances in Neural Information Processing Systems (NeurIPS), Vol. 33, 2020, pp. 1877–1901.
3. L. Ouyang, J. Wu, X. Jiang, D. Almeida, C. Wainwright, P. Mishkin, C. Zhang, S. Agarwal, K. Slama, A. Ray, et al., Training language models to follow instructions with human feedback, in: Advances in Neural Information Processing Systems (NeurIPS), Vol. 35, 2022, pp. 27730–27744.
4. A. Radford, J.W. Kim, C. Hallacy, A. Ramesh, G. Goh, S. Agarwal, G. Sastry, A. Askell, P. Mishkin, J. Clark, et al., Learning transferable visual models from natural language supervision, in: International Conference on Machine Learning (ICML), 2021, pp. 8748–8763.
5. C. Jia, Y. Yang, Y. Xia, Y.-T. Chen, Z. Parekh, H. Pham, Q. Le, Y.-H. Sung, Z. Li, T. Duerig, Scaling up visual and vision-language representation learning with noisy text supervision, in: International Conference on Machine Learning (ICML), 2021, pp. 4904–4916.