Authors:
Xiong Huiyu, Wang Lanxiao, Qiu Heqian, Zhao Taijin, Qiu Benliu, Li Hongliang
References: 73 articles.
1. Attend and interact: Higher-order object interactions for video understanding;C Y Ma;IEEE/CVF Conference on Computer Vision and Pattern Recognition,2018
2. Spatio-temporal dynamics and semantic attribute enriched visual encoding for video captioning;N Aafaq;IEEE/CVF Conference on Computer Vision and Pattern Recognition,2019
3. Graph convolutional network meta-learning with multi-granularity POS guidance for video captioning;P Li;Neurocomputing,2022
4. CLIP4Clip: An empirical study of CLIP for end to end video clip retrieval and captioning;H Luo;Neurocomputing,2022
5. Hierarchical multimodal transformer to summarize videos;B Zhao;Neurocomputing,2022