1. Deep ViT features as dense visual descriptors;Amir,2021
2. Label-efficient semantic segmentation with diffusion models;Baranchuk;arXiv,2022
3. On the opportunities and risks of foundation models;Bommasani,2021
4. Language models are few-shot learners;Brown;Advances in Neural Information Processing Systems,2020
5. Emerging properties in self-supervised vision transformers