1. Alberto Baldrati, Lorenzo Agnolucci, Marco Bertini, and Alberto Del Bimbo. 2023. Zero-Shot Composed Image Retrieval with Textual Inversion. arXiv preprint arXiv:2303.15247 (2023).
2. Effective conditioned and composed image retrieval combining CLIP-based features
3. Microsoft COCO: Common Objects in Context
4. Image Retrieval on Real-life Images with Pre-trained Vision-and-Language Models
5. Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. 2021. Learning transferable visual models from natural language supervision. In Proc. of International Conference on Machine Learning (ICML). PMLR, 8748–8763.