1. Generative pretraining from pixels;chen;Proc Int Conf Mach Learn,2020
2. VL-BERT: Pre-training of generic visual-linguistic representations;su;Proc Int Conf Learn Representations,2020
3. BEiT: BERT pre-training of image transformers;bao;Proc Int Conf Learn Representations,2022
4. An image is worth 16x16 words: Transformers for image recognition at scale;dosovitskiy;Proc Int Conf Learn Representations,2021
5. Bootstrap your own latent: A new approach to self-supervised learning;grill;Proc Adv Neural Inf Process Syst,2020