Funder
National Natural Science Foundation of China
Subject
Artificial Intelligence, Computer Vision and Pattern Recognition, Signal Processing, Software