Funder
Ministry of Science and Technology, Croatia
National Natural Science Foundation of China