1. ViViT: A Video Vision Transformer
2. Jimmy Lei Ba, Jamie Ryan Kiros, and Geoffrey E Hinton. 2016. Layer normalization. arXiv preprint arXiv:1607.06450 (2016).
3. David Berthelot, Nicholas Carlini, Ian Goodfellow, Nicolas Papernot, Avital Oliver, and Colin A Raffel. 2019. Mixmatch: A holistic approach to semi-supervised learning. In Advances in Neural Information Processing Systems, Vol. 32.
4. Deep Clustering for Unsupervised Learning of Visual Features
5. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset