AttnGrounder: Talking to Cars with Attention

Published:2020 Page:62-73
ISSN:0302-9743
Container-title:Computer Vision – ECCV 2020 Workshops

Author:

Mittal Vivek

Publisher

Springer International Publishing

Link

https://link.springer.com/content/pdf/10.1007/978-3-030-66096-3_6

References: 31 articles.

1. Anderson, P., et al.: Bottom-up and top-down attention for image captioning and visual question answering. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6077–6086 (2018)

2. Caesar, H., et al.: nuScenes: a multimodal dataset for autonomous driving. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11621–11631 (2020)

3. Chen, X., Ma, L., Chen, J., Jie, Z., Liu, W., Luo, J.: Real-time referring expression comprehension by single-stage grounding network. arXiv preprint arXiv:1812.03426 (2018)

4. Deng, C., Wu, Q., Wu, Q., Hu, F., Lyu, F., Tan, M.: Visual grounding via accumulated attention. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7746–7755 (2018)

5. Deruyttere, T., Collell, G., Moens, M.F.: Giving commands to a self-driving car: A multimodal reasoner for visual grounding. arXiv preprint arXiv:2003.08717 (2020)

Cited by 9 articles.

1. GPT-4 enhanced multimodal grounding for autonomous driving: Leveraging cross-modal attention with large language models;Communications in Transportation Research;2024-12

2. Lgvc: language-guided visual context modeling for 3D visual grounding;Neural Computing and Applications;2024-04-23

3. Comprehensive survey on 3D visual-language understanding techniques;Journal of Image and Graphics;2024

4. Self-Supervised Learning based 3D Visual Question answering for Scene Understanding;2023 5th International Conference on Frontiers Technology of Information and Computer (ICFTIC);2023-11-17

5. Distilling Coarse-to-Fine Semantic Matching Knowledge for Weakly Supervised 3D Visual Grounding;2023 IEEE/CVF International Conference on Computer Vision (ICCV);2023-10-01