Oversea Cross-Lingual Summarization Service in Multilanguage Pre-Trained Model through Knowledge Distillation
Published: 2023-12-14
Issue: 24
Volume: 12
Page: 5001
ISSN: 2079-9292
Container-title: Electronics
Language: en
Short-container-title: Electronics
Author:
Yang Xiwei ¹, Yun Jing ¹, Zheng Bofei ¹, Liu Limin ¹, Ban Qi ¹
Affiliation:
1. School of Data Science and Applications, Inner Mongolia University of Technology, Hohhot 010080, China
Abstract
Cross-lingual text summarization is a highly desired service for overseas report-editing tasks and is deployed as a distributed application to facilitate cooperation among editors. A multilanguage pre-trained language model (MPLM) can generate high-quality cross-lingual summaries with simple fine-tuning. However, the MPLM does not adapt well to complex cross-language variations, such as differences in word order and tense. When the model is applied to language pairs with distinct syntactic structures and vocabulary morphologies, the quality of the cross-lingual summaries degrades, and the problem worsens when the cross-lingual summarization datasets are low-resource. To address these issues, we use a knowledge distillation framework for the cross-lingual summarization task. By learning from a monolingual teacher model, the cross-lingual student model can effectively capture the differences between languages. Since the teacher and student models generate summaries in two different languages, their representations lie in different vector spaces. To construct representation relationships across languages, we further propose a similarity metric based on bidirectional semantic alignment that maps the representations of different languages into the same space. To further improve the quality of the cross-lingual summaries, we use contrastive learning so that the student model focuses on the differences among languages; contrastive learning also strengthens the bidirectional semantic alignment performed by the similarity metric. Our experiments show that our approach is competitive in low-resource scenarios on cross-lingual summarization datasets for pairs of distant languages.
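As a rough sketch of how the distillation objectives summarized in the abstract could be realized (assuming a PyTorch setup; the projection size, mean pooling, max-over-token matching, InfoNCE-style contrastive term, and temperature below are illustrative assumptions, not details taken from the paper):

```python
# Illustrative sketch, not the authors' implementation: teacher (monolingual)
# and student (cross-lingual) hidden states are projected into a shared space,
# scored with a bidirectional alignment metric, and trained with an
# InfoNCE-style contrastive term over in-batch pairs.
import torch
import torch.nn.functional as F
from torch import nn


class BidirectionalAlignment(nn.Module):
    """Map teacher and student representations into one space and compute a
    bidirectional (student-to-teacher and teacher-to-student) alignment score."""

    def __init__(self, teacher_dim: int, student_dim: int, shared_dim: int = 512):
        super().__init__()
        self.teacher_proj = nn.Linear(teacher_dim, shared_dim)
        self.student_proj = nn.Linear(student_dim, shared_dim)

    def forward(self, teacher_hidden: torch.Tensor, student_hidden: torch.Tensor):
        # teacher_hidden: (B, Lt, teacher_dim); student_hidden: (B, Ls, student_dim)
        t = F.normalize(self.teacher_proj(teacher_hidden), dim=-1)
        s = F.normalize(self.student_proj(student_hidden), dim=-1)
        sim = torch.bmm(s, t.transpose(1, 2))            # (B, Ls, Lt) token-pair cosine similarities
        s_to_t = sim.max(dim=2).values.mean(dim=1)       # each student token matched to its best teacher token
        t_to_s = sim.max(dim=1).values.mean(dim=1)       # each teacher token matched to its best student token
        align_score = 0.5 * (s_to_t + t_to_s)            # (B,) bidirectional alignment per example
        # pooled sentence-level embeddings for the batch-level contrastive term
        s_pooled = F.normalize(s.mean(dim=1), dim=-1)
        t_pooled = F.normalize(t.mean(dim=1), dim=-1)
        return align_score, s_pooled, t_pooled


def distillation_losses(align_score, s_pooled, t_pooled, temperature: float = 0.1):
    """Alignment loss pulls matched pairs toward perfect alignment; the
    contrastive loss makes matched pairs (diagonal) beat mismatched in-batch pairs."""
    align_loss = (1.0 - align_score).mean()
    logits = s_pooled @ t_pooled.T / temperature          # (B, B) pairwise similarities
    targets = torch.arange(logits.size(0), device=logits.device)
    contrast_loss = F.cross_entropy(logits, targets)
    return align_loss, contrast_loss


# Example usage with random tensors standing in for encoder hidden states.
teacher_h = torch.randn(4, 20, 1024)   # monolingual teacher outputs (assumed size)
student_h = torch.randn(4, 18, 768)    # cross-lingual student outputs (assumed size)
aligner = BidirectionalAlignment(teacher_dim=1024, student_dim=768)
score, s_vec, t_vec = aligner(teacher_h, student_h)
l_align, l_contrast = distillation_losses(score, s_vec, t_vec)
```

In this sketch the two losses would be added, with task-specific weights, to the usual summary generation loss of the student model.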
Funder
National Natural Science Foundation of China; Basic Scientific Research Expenses Program of Universities directly under Inner Mongolia Autonomous Region; Teaching Reform Project for Postgraduate Education under the Inner Mongolia University of Technology
Subject
Electrical and Electronic Engineering, Computer Networks and Communications, Hardware and Architecture, Signal Processing, Control and Systems Engineering
Cited by: 1 article.