1. R. Bommasani, et al., On the opportunities and risks of foundation models, 2021.
2. J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, in: Proceedings of NAACL-HLT, 2019, pp. 4171–4186.
3. T.B. Brown, et al., Language models are few-shot learners, 2020.
4. M. Oquab, et al., DINOv2: Learning robust visual features without supervision, 2023.
5. A. Krizhevsky, I. Sutskever, G.E. Hinton, ImageNet classification with deep convolutional neural networks, in: Proceedings of the 25th International Conference on Advances in Neural Information Processing Systems, 2012, pp. 1097–1105.