1. Jean-Baptiste Alayrac, Piotr Bojanowski, Nishant Agrawal, Josef Sivic, Ivan Laptev, and Simon Lacoste-Julien. 2016. Unsupervised learning from narrated instruction videos. In CVPR.
2. Max Bain, Arsha Nagrani, Andrew Brown, and Andrew Zisserman. 2020. Condensed movies: Story based retrieval with contextual embeddings. In ACCV.
3. Yi Bin, Xindi Shang, Bo Peng, Yujuan Ding, and Tat-Seng Chua. 2021. Multi-perspective video captioning. In ACM Multimedia.
4. Hervé Bredin, Claude Barras, and Camille Guinaudeau. 2016. Multimodal person discovery in broadcast TV at MediaEval 2016. In MediaEval 2016.
5. Andrew Brown, Ernesto Coto, and Andrew Zisserman. 2021. Automated video labelling: Identifying faces by corroborative evidence. In IEEE-MIPR.