1. Conditioned and composed image retrieval combining and partially fine-tuning CLIP-based features
2. Bingyi Cao, Andre Araujo, and Jack Sim. 2020. Unifying deep local and global features for image search. In Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XX 16. Springer, 726–743.
3. CLIP-Art: Contrastive Pre-training for Fine-Grained Art Classification
4. EVA: Exploring the Limits of Masked Visual Representation Learning at Scale
5. Class Weighted Convolutional Features for Visual Instance Search