1. A closer look at referring expressions for video object segmentation;Miriam Bellver;Multimedia Tools and Applications,2023
2. Language models are few-shot learners;Tom Brown;Advances in Neural Information Processing Systems,2020
3. X-DETR: A versatile architecture for instance-wise vision-language tasks;Zhaowei Cai;European Conference on Computer Vision,2022
4. Cascade R-CNN: Delving into high quality object detection;Zhaowei Cai;Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition,2018
5. End-to-end object detection with transformers;Nicolas Carion;European Conference on Computer Vision,2020