1. LLaMA: Open and efficient foundation language models;Touvron,2023
2. Exploring the limits of transfer learning with a unified text-to-text transformer;Raffel;The Journal of Machine Learning Research,2020
3. An image is worth 16x16 words: Transformers for image recognition at scale;Dosovitskiy;International Conference on Learning Representations,2021
4. BERT: Pre-training of deep bidirectional transformers for language understanding;Devlin;North American Chapter of the Association for Computational Linguistics,2019