A Feature-space Multimodal Data Augmentation Technique for Text-video Retrieval

Published: 2022-10-10
Container-title: Proceedings of the 30th ACM International Conference on Multimedia

Author:

Falcon Alex¹, Serra Giuseppe², Lanz Oswald³

Affiliation:

1. Fondazione Bruno Kessler & University of Udine, Trento, Italy

2. University of Udine, Udine, Italy

3. Free University of Bozen-Bolzano, Bolzano, Italy

Publisher

ACM

Link

https://dl.acm.org/doi/pdf/10.1145/3503161.3548365

References: 72 articles.

1. Kfir Aberman, Mingyi Shi, Jing Liao, Dani Lischinski, Baoquan Chen, and Daniel Cohen-Or. 2019. Deep video-based performance cloning. In Computer Graphics Forum, Vol. 38. Wiley Online Library, 219-233.

2. Ricardo Baeza-Yates, Berthier Ribeiro-Neto, et al. 1999. Modern information retrieval. Vol. 463. ACM Press, New York.

3. Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval

4. Irwan Bello, William Fedus, Xianzhi Du, Ekin Dogus Cubuk, Aravind Srinivas, Tsung-Yi Lin, Jonathon Shlens, and Barret Zoph. 2021. Revisiting resnets: Improved training and scaling strategies. Advances in Neural Information Processing Systems, Vol. 34 (2021).

5. Survey on Videos Data Augmentation for Deep Learning Models

Cited by 11 articles.

1. Collaborative group: Composed image retrieval via consensus learning from noisy annotations; Knowledge-Based Systems; 2024-09

2. Latent Filling: Latent Space Data Augmentation for Zero-Shot Speech Synthesis; ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); 2024-04-14

3. Text-Video Retrieval via Multi-Modal Hypergraph Networks; Proceedings of the 17th ACM International Conference on Web Search and Data Mining; 2024-03-04

4. Semantic Fusion Augmentation and Semantic Boundary Detection: A Novel Approach to Multi-Target Video Moment Retrieval; 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV); 2024-01-03

5. Adaptively Forget with Crossmodal and Textual Distillation for Class-Incremental Video Captioning; 2024