Authors:
Tang Binghao, Lin Boda, Chang Zheng, Li Si
References (37 articles)
1. Doubly-attentive decoder for multi-modal neural machine translation; I Calixto; Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017
2. Global inference for sentence compression: An integer linear programming approach; J Clarke; Journal of Artificial Intelligence Research, 2008
3. An image is worth 16x16 words: Transformers for image recognition at scale; A Dosovitskiy; International Conference on Learning Representations, 2020
4. Multimodal summarization of meeting recordings; B Erol; 2003 International Conference on Multimedia and Expo. ICME'03. Proceedings, 2003
5. Multimodal saliency and fusion for movie summarization based on aural, visual, and textual attention; G Evangelopoulos; IEEE Transactions on Multimedia, 2013