1. CrossTransformers: spatially-aware few-shot transfer;doersch;Advances in neural information processing systems,2020
2. VIMPAC: Video pre-training via masked token prediction and contrastive learning;tan;arXiv preprint,2021
3. BERT: Pre-training of deep bidirectional transformers for language understanding;devlin;arXiv preprint,2018
4. Learning to Compare: Relation Network for Few-Shot Learning
5. MaskCLIP: Masked self-distillation advances contrastive language-image pretraining;dong;arXiv preprint,2022