1. A Survey of Vision-Language Pre-Trained Models;Du,2022
2. MedCLIP: Contrastive Learning from Unpaired Medical Images and Text;Wang,2022
3. Self-supervised image-text pre-training with mixed data in chest x-rays;Wang,2021
4. Language models are few-shot learners;Brown;Advances in Neural Information Processing Systems,2020
5. BERT: Pre-training of deep bidirectional transformers for language understanding;Devlin,2018