1. A better use of audio-visual cues: Dense video captioning with bi-modal transformer;Iashin;Proceedings of the 31st British Machine Vision Conference,2020
2. Multi-modal dense video captioning;Iashin;Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops,2020
3. End-to-end dense video captioning with parallel decoding;Wang;Proceedings of the IEEE/CVF International Conference on Computer Vision,2021
4. How blind people interact with visual content on social networking services;Voykinska;Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work & Social Computing,2016
5. Movie Description