1. A good prompt is worth millions of parameters? low-resource prompt-based learning for vision-language models;jin;ArXiv Preprint,2021
2. Few-shot learning with retrieval augmented language models;izacard;ArXiv Preprint,2022
3. Deep visual-semantic alignments for generating image descriptions;karpathy;Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition,0
4. Billion-scale similarity search with gpus;johnson;ArXiv Preprint,2017
5. Scaling Up Vision-Language Pretraining for Image Captioning