Combining permuted language model and adversarial training for Chinese machine reading comprehension

Author:

Liu Jianping (1,2), Chu Xintao (1), Wang Jian (3), Wang Meng (1), Wang Yingfei (1)

Affiliation:

1. College of Computer Science and Engineering, North Minzu University, Yinchuan, China

2. The Key Laboratory of Images and Graphics Intelligent Processing of State Ethnic Affairs Commission, North Minzu University, Yinchuan, China

3. Agricultural Information Institute, Chinese Academy of Agricultural Sciences, Beijing, China

Abstract

Due to the polysemy and complexity of the Chinese language, Chinese machine reading comprehension has always been a challenging task. To improve the semantic understanding and robustness of Chinese machine reading comprehension models, we propose a model that combines adversarial training algorithms with the Permuted Language Model (PERT). First, we use the PERT pre-trained model to embed paragraphs and questions into a vector space and obtain the corresponding sequence representations. Second, we apply a multi-head self-attention mechanism to extract key textual information from the sequence and use a Bi-GRU network to fuse the output feature vectors, learning deep semantic representations of the text. Finally, we introduce perturbations into the training process through adversarial training algorithms such as the Fast Gradient Method (FGM) and Projected Gradient Descent (PGD), which generate adversarial samples to enhance the model's robustness and stability on diverse inputs. We conducted comparative experiments on the publicly available Chinese reading comprehension datasets CMRC2018 and DRCD. The experimental results show that the proposed model achieves significant improvements in both EM and F1-score over the baseline model. To validate the model's generalization and robustness, we used ChatGPT to construct a scientific dataset containing a large number of domain-specific terms, sentences mixing Chinese and English, and complex comprehension tasks; our model also performed well on this self-built dataset. In conclusion, the proposed model not only effectively enhances the understanding of semantic information in Chinese text but also demonstrates a degree of generalization capability.
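The abstract's adversarial-training step can be illustrated with the standard perturbation rules behind FGM and PGD. The following is a minimal NumPy sketch, not the authors' implementation: in the actual model these perturbations would be applied to the embedding-layer gradients during training, and the function names, epsilon, step size, and toy loss are all illustrative assumptions.

```python
import numpy as np

def fgm_perturbation(grad, epsilon=0.5):
    """FGM: a single perturbation step along the gradient direction,
    scaled so the perturbation has L2 norm epsilon."""
    norm = np.linalg.norm(grad)
    if norm == 0:
        return np.zeros_like(grad)
    return epsilon * grad / norm

def pgd_perturbation(grad_fn, x, epsilon=0.5, alpha=0.3, steps=5):
    """PGD: several small FGM-style steps, projecting the accumulated
    perturbation back onto the epsilon-ball around the clean input."""
    r = np.zeros_like(x)
    for _ in range(steps):
        g = grad_fn(x + r)          # gradient at the perturbed point
        gnorm = np.linalg.norm(g)
        if gnorm > 0:
            r = r + alpha * g / gnorm
        rnorm = np.linalg.norm(r)
        if rnorm > epsilon:         # projection step
            r = epsilon * r / rnorm
    return r

# Toy example: loss = 0.5 * ||x||^2, so the gradient w.r.t. x is x itself.
x = np.array([3.0, 4.0])
r_fgm = fgm_perturbation(x, epsilon=0.5)
r_pgd = pgd_perturbation(lambda v: v, x, epsilon=0.5, alpha=0.3, steps=5)
```

In training, the perturbed input `x + r` is fed through the model a second time and the adversarial loss is added to the clean loss, which is what encourages the robustness the abstract describes.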

Publisher

IOS Press

References (10 articles)

1. Hermann K.M., Kocisky T., Grefenstette E., Espeholt L., Kay W., Suleyman M., Blunsom P., Teaching machines to read and comprehend, Advances in Neural Information Processing Systems 28 (2015).

2. Yang Z., Dai Z., Yang Y., Carbonell J., Salakhutdinov R.R., Le Q.V., XLNet: Generalized autoregressive pretraining for language understanding, Advances in Neural Information Processing Systems 32 (2019).

3. Feng, Improving the robustness of machine reading comprehension via contrastive learning, Applied Intelligence (2023).

4. Cui, Pre-training with whole word masking for Chinese BERT, IEEE/ACM Transactions on Audio, Speech, and Language Processing (2021).

5. Yu C. and Li X., SSAG-Net: Syntactic and semantic attention-guided machine reading comprehension, Intelligent Automation & Soft Computing 34(3) (2022).
