Authors:
Cheng Feng, Bertasius Gedas
Publisher
Springer Nature Switzerland
References: 63 articles.
1. Bagchi, A., Mahmood, J., Fernandes, D., Sarvadevabhatla, R.K.: Hear me out: fusional approaches for audio augmented temporal action localization. arXiv preprint arXiv:2106.14118 (2021)
2. Bai, Y.: Lecture Notes in Computer Science (2020)
3. Bertasius, G., Wang, H., Torresani, L.: Is space-time attention all you need for video understanding? arXiv preprint arXiv:2102.05095 (2021)
4. Caba Heilbron, F., Escorcia, V., Ghanem, B., Carlos Niebles, J.: Activitynet: a large-scale video benchmark for human activity understanding. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 961–970 (2015)
5. Carreira, J., Zisserman, A.: Quo vadis, action recognition? a new model and the kinetics dataset. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6299–6308 (2017)
Cited by
36 articles.