1. Lu, et al., ViLBERT: Pretraining task-agnostic visiolinguistic representations for vision-and-language tasks, Adv. Neural Inf. Process. Syst. (2019).
2. D.-K. Nguyen, T. Okatani, Multi-task learning of hierarchical vision-language representation, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 10492–10501.
3. T. Gupta, A. Kamath, A. Kembhavi, D. Hoiem, Towards general purpose vision systems: An end-to-end task-agnostic vision-language architecture, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 16399–16409.
4. K. Lin, H.-F. Yang, J.-H. Hsiao, C.-S. Chen, Deep learning of binary hash codes for fast image retrieval, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2015, pp. 27–35.
5. Zieba, et al., BinGAN: Learning compact binary descriptors with a regularized GAN, Adv. Neural Inf. Process. Syst. (2018).