1. Sami Abu-El-Haija, Nisarg Kothari, Joonseok Lee, Paul Natsev, George Toderici, Balakrishnan Varadarajan, and Sudheendra Vijayanarasimhan. 2016. YouTube-8M: A large-scale video classification benchmark. arXiv preprint arXiv:1609.08675 (2016).
2. Humam Alwassel, Dhruv Mahajan, Lorenzo Torresani, Bernard Ghanem, and Du Tran. 2019. Self-Supervised Learning by Cross-Modal Audio-Video Clustering. arXiv preprint arXiv:1911.12667 (2019).
3. Look, Listen and Learn
4. Objects that Sound