1. Video Summarization Using Deep Neural Networks: A Survey
2. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
3. Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, et al. 2020. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929 (2020).
4. Jiri Fajtl, Hajar Sadeghi Sokeh, Vasileios Argyriou, Dorothy Monekosso, and Paolo Remagnino. 2018. Summarizing videos with attention. In Asian Conference on Computer Vision. Springer, 39–54.
5. Junaid Ahmed Ghauri, Sherzod Hakimov, and Ralph Ewerth. 2021. Supervised Video Summarization Via Multiple Feature Sets with Parallel Attention. In 2021 IEEE International Conference on Multimedia and Expo (ICME). IEEE, 1–6.