@ CREPE: Can Vision-Language Foundation Models Reason Compositionally?

Published: 2023-06
Container-title: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Author:

Ma Zixian¹, Hong Jerry¹, Gul Mustafa Omer², Gandhi Mona³, Gao Irena¹, Krishna Ranjay⁴

Affiliation:

1. Stanford University

2. Cornell University

3. University of Pennsylvania

4. University of Washington

Publisher

IEEE

Link

http://xplorestaging.ieee.org/ielx7/10203037/10203050/10205135.pdf?arnumber=10205135

References: 83 articles.

1. Can foundation models perform zero-shot task specification for robot manipulation?;Cui;Proceedings of the 4th Annual Learning for Dynamics and Control Conference, volume 168 of Proceedings of Machine Learning Research,2022

2. Learning transferable visual models from natural language supervision;Radford;International Conference on Machine Learning,2021

Cited by 9 articles.

1. CUPID: Contextual Understanding of Prompt‐conditioned Image Distributions;Computer Graphics Forum;2024-06

2. Lang3DSG: Language-based contrastive pre-training for 3D Scene Graph prediction;2024 International Conference on 3D Vision (3DV);2024-03-18

3. Enhancing Multimodal Compositional Reasoning of Visual Language Models with Generative Negative Mining;2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV);2024-01-03

4. Evaluating CLIP’s Understanding on Relationships in a Blocks World;2023 IEEE International Conference on Big Data (BigData);2023-12-15

5. Breaking Boundaries Between Linguistics and Artificial Intelligence;Journal of Organizational and End User Computing;2023-11-21