Multimodal Pretraining from Monolingual to Multilingual

Published: 2023-03-31 Issue: 2 Volume: 20 Page: 220-232
ISSN: 2731-538X
Container-title: Machine Intelligence Research
Language: en
Short-container-title: Mach. Intell. Res.

Author:

Zhang Liang, Ruan Ludan, Hu Anwen, Jin Qin

Publisher

Springer Science and Business Media LLC

Subject

Applied Mathematics, Artificial Intelligence, Computer Networks and Communications, Computer Science Applications, Computer Vision and Pattern Recognition, Modeling and Simulation, Signal Processing, Control and Systems Engineering

Link

https://link.springer.com/content/pdf/10.1007/s11633-022-1414-4.pdf

References (44 articles)

1. H. Zhu, M. D. Luo, R. Wang, A. H. Zheng, R. He. Deep audio-visual learning: A survey. International Journal of Automation and Computing, vol. 18, no. 3, pp. 351–376, 2021. DOI: https://doi.org/10.1007/s11633-021-1293-0.

2. L. W. Zhou, H. Palangi, L. Zhang, H. D. Hu, J. Corso, J. F. Gao. Unified vision-language pre-training for image captioning and VQA. In Proceedings of AAAI Conference on Artificial Intelligence, vol. 34, no. 7, pp. 13041–13049, 2020. DOI: https://doi.org/10.1609/aaai.v34i07.7005.

3. Y. C. Chen, L. J. Li, L. C. Yu, A. El Kholy, F. Ahmed, Z. Gan, Y. Cheng, J. J. Liu. UNITER: Universal image-text representation learning. In Proceedings of the 16th European Conference on Computer Vision, Springer, Glasgow, UK, pp. 104–120, 2020. DOI: https://doi.org/10.1007/978-3-030-58577-8_7.

4. A. Radford, J. W. Kim, C. Hallacy, A. Ramesh, G. Goh, S. Agarwal, G. Sastry, A. Askell, P. Mishkin, J. Clark, G. Krueger, I. Sutskever. Learning transferable visual models from natural language supervision. In Proceedings of the 38th International Conference on Machine Learning, 2021.

5. N. Reimers, I. Gurevych. Making monolingual sentence embeddings multilingual using knowledge distillation. In Proceedings of Conference on Empirical Methods in Natural Language Processing, pp. 4512–4525, 2020. DOI: https://doi.org/10.18653/v1/2020.emnlp-main.365.