Author:
Liang Mingliang, Liu Zhouran, Larson Martha
Publisher:
Springer Nature Switzerland
References (36 articles):
1. Brown, K.S., et al.: Investigating the extent to which distributional semantic models capture a broad range of semantic relations. Cogn. Sci. 47(5), e13291 (2023)
2. Chen, T., Kornblith, S., Norouzi, M., Hinton, G.E.: A simple framework for contrastive learning of visual representations. In: ICML, vol. 119, pp. 1597–1607 (2020)
3. Cherti, M., et al.: Reproducible scaling laws for contrastive language-image learning. In: CVPR, pp. 2818–2829 (2023)
4. Dosovitskiy, A., et al.: An image is worth 16×16 words: transformers for image recognition at scale. In: ICLR (2021)
5. Dou, Z., et al.: An empirical study of training end-to-end vision-and-language transformers. In: CVPR, pp. 18145–18155 (2022)