Video–text retrieval via multi-modal masked transformer and adaptive attribute-aware graph convolutional network

Published:2024-01-22 Issue:1 Volume:30
ISSN:0942-4962
Container-title:Multimedia Systems
language:en
Short-container-title:Multimedia Systems

Author:

Lv Gang, Sun Yining, Nian Fudong

Funder

University Synergy Innovation Program of Anhui Province

Anhui Provincial Key Research and Development Program

National Natural Science Foundation of China

Publisher

Springer Science and Business Media LLC

Link

https://link.springer.com/content/pdf/10.1007/s00530-023-01205-8.pdf

Reference: 51 articles.

1. Amrani, E., Ben-Ari, R., Rotman, D., et al: Noise estimation using density estimation for self-supervised multimodal learning. In: Proceedings of the AAAI Conference on Artificial Intelligence, pp 6644–6652 (2021)

2. Bain, M., Nagrani, A., Varol, G., et al: Frozen in time: A joint video and image encoder for end-to-end retrieval. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp 1728–1738 (2021)

3. Barraco, M., Cornia, M., Cascianelli, S., et al: The unreasonable effectiveness of CLIP features for image captioning: an experimental analysis. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 4662–4670 (2022)

4. Bogolin, S.V., Croitoru, I., Jin, H., et al: Cross modal retrieval with querybank normalisation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 5194–5205 (2022)

5. Chen, D., Dolan, W.B.: Collecting highly parallel data for paraphrase evaluation. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pp 190–200 (2011)

Cited by 1 article.

1. Repeat and learn: Self-supervised visual representations learning by Repeated Scene Localization; Pattern Recognition; 2024-12