1. George Awad, Duy-Dinh Le, Chong-Wah Ngo, Vinh-Tiep Nguyen, Georges Quénot, Cees Snoek, and Shin'ichi Satoh. 2017. Video Indexing, Search, Detection, and Description with Focus on TRECVID. In Proceedings of the ACM International Conference on Multimedia Retrieval. 3–4.
2. Jianfeng Dong, Xirong Li, and Cees Snoek. 2018. Predicting Visual Features from Text for Image and Video Caption Retrieval. IEEE Transactions on Multimedia, Vol. 20, 12 (2018), 3377–3388.
3. Jianfeng Dong, Xirong Li, Chaoxi Xu, Shouling Ji, Yuan He, Gang Yang, and Xun Wang. 2019. Dual Encoding for Zero-Example Video Retrieval. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 9346–9355.
4. Weiyu Lan, Xirong Li, and Jianfeng Dong. 2017. Fluency-Guided Cross-Lingual Image Captioning. In Proceedings of the 25th ACM International Conference on Multimedia. 1549–1557.
5. Xirong Li, Chaoxi Xu, Xiaoxu Wang, Weiyu Lan, Zhengxiong Jia, Gang Yang, and Jieping Xu. 2019. COCO-CN for Cross-Lingual Image Tagging, Captioning and Retrieval. IEEE Transactions on Multimedia, Vol. 21, 9 (2019), 2347–2360.