Advancing Weakly-Supervised Audio-Visual Video Parsing via Segment-Wise Pseudo Labeling

Published: 2024-06-09
ISSN: 0920-5691
Container-title: International Journal of Computer Vision
Language: en
Short-container-title: Int J Comput Vis

Author:

Zhou Jinxing, Guo Dan, Zhong Yiran, Wang Meng

Funder

National Key R&D Program of China

National Natural Science Foundation of China

Major Project of Anhui Province

Publisher

Springer Science and Business Media LLC

Link

https://link.springer.com/content/pdf/10.1007/s11263-024-02142-3.pdf

References (87 articles)

1. Afouras, T., Owens, A., Chung, J. S., & Zisserman, A. (2020). Self-supervised learning of audio-visual objects from video. In Proceedings of the European conference on computer vision (ECCV) (pp. 208–224).

2. Alayrac, J. B., Donahue, J., Luc, P., Miech, A., Barr, I., Hasson, Y., Lenc, K., Mensch, A., Millican, K., Reynolds, M., et al. (2022). Flamingo: A visual language model for few-shot learning. arXiv:2204.14198

3. Arandjelovic, R., & Zisserman, A. (2017). Look, listen and learn. In Proceedings of the IEEE international conference on computer vision (ICCV) (pp. 609–617).

4. Arandjelovic, R., & Zisserman, A. (2018). Objects that sound. In Proceedings of the European conference on computer vision (ECCV) (pp. 435–451).

5. Barraco, M., Cornia, M., Cascianelli, S., Baraldi, L., & Cucchiara, R. (2022). The unreasonable effectiveness of clip features for image captioning: An experimental analysis. In Workshops of proceedings of the IEEE/CVF conference on computer vision and pattern recognition (CVPR) (pp. 4662–4670).

Cited by 1 article.

1. VADS: Visuo-Adaptive DualStrike attack on visual question answer. Computer Vision and Image Understanding, 2024-08.