1. RT-2: Vision-language-action models transfer web knowledge to robotic control;Brohan,2023
2. BC-Z: Zero-shot task generalization with robotic imitation learning;Jang,2021
3. Perceiver-Actor: A multi-task transformer for robotic manipulation;Shridhar,2022
4. Act3D: 3D feature field transformers for multi-task robotic manipulation;Gervet,2023
5. Open-world object manipulation using pre-trained vision-language models;Stone,2023