1. Robust bi-tempered logistic loss based on Bregman divergences;Amid;Advances in Neural Information Processing Systems,2019
2. BEiT: BERT pre-training of image transformers;Bao,2021
3. On the opportunities and risks of foundation models;Bommasani,2021
4. Language models are few-shot learners;Brown;Advances in Neural Information Processing Systems,2020