A Joint Modeling of Vision-Language-Action for Target-oriented Grasping in Clutter

Published: 2023-05-29
Container-title: 2023 IEEE International Conference on Robotics and Automation (ICRA)

Author:

Xu Kechun¹, Zhao Shuqi¹, Zhou Zhongxiang¹, Li Zizhang¹, Pi Huaijin¹, Zhu Yifeng², Wang Yue¹, Xiong Rong¹

Affiliation:

1. Zhejiang University, Hangzhou, China

2. University of Texas at Austin, United States

Publisher

IEEE

Link

http://xplorestaging.ieee.org/ielx7/10160211/10160212/10161041.pdf?arnumber=10161041

References: 49 articles.

1. VLMbench: A compositional benchmark for vision-and-language manipulation;Zheng;ArXiv Preprint,20

2. CLIPort: What and where pathways for robotic manipulation;Shridhar;Conference on Robot Learning,0

3. Interactive visual grounding of referring expressions for human-robot interaction;Shridhar;ArXiv Preprint,20

4. Interactively picking real-world objects with unconstrained spoken language instructions;Hatori;2018 IEEE International Conference on Robotics and Automation (ICRA),0

5. Inner monologue: Embodied reasoning through planning with language models;Huang;ArXiv Preprint,20

Cited by 2 articles.

1. Object-Centric Inference for Language Conditioned Placement: A Foundation Model based Approach;2023 International Conference on Advanced Robotics and Mechatronics (ICARM);2023-07-08

2. Language Guided Grasping of Unknown Concepts Based on Knowledge System;Intelligent Robotics and Applications;2023