1. S. Abu-El-Haija, N. Kothari, J. Lee, P. Natsev, G. Toderici, B. Varadarajan, S. Vijayanarasimhan, YouTube-8M: a large-scale video classification benchmark, (2016) 1–10. arXiv preprint arXiv:1609.08675
2. Simonyan, Two-stream convolutional networks for action recognition in videos, (2014)
3. Donahue, Long-term recurrent convolutional networks for visual recognition and description, (2015)
4. Zhu, Faster recurrent networks for efficient video classification, (2020)
5. Carreira, Quo vadis, action recognition? A new model and the Kinetics dataset, (2017)