Multi-grained visual pivot-guided multi-modal neural machine translation with text-aware cross-modal contrastive disentangling
Published: 2024-10
Volume: 178
Page: 106403
ISSN: 0893-6080
Container-title: Neural Networks
Language: en
Short-container-title: Neural Networks
Author: Guo Junjun, Su Rui, Ye Junjie