Reaching quality and efficiency with a parameter-efficient controllable sentence simplification approach

Author:

Menta Antonio (1), Garcia-Serrano Ana (1)

Affiliation:

1. E.T.S.I. Informática (UNED), C. de Juan del Rosal, Madrid, Spain

Abstract

Automatic Text Simplification (ATS) aims to transform texts to improve their readability and comprehensibility. Current solutions are based on Large Language Models (LLMs). These models perform well but require powerful computing resources and large amounts of data for fine-tuning in specific, technical domains, which prevents most researchers from adapting them to their area of study. The main contributions of this research are: (1) an accurate solution for settings where powerful resources are not available, which exploits transfer learning across domains by combining a set of linguistic features with a reduced-size pre-trained language model (T5-small), making it accessible to a broader range of researchers and individuals; and (2) an evaluation of our model on two well-known datasets, TurkCorpus and ASSET, together with an analysis of the influence of control tokens on the SimpleText corpus, focusing on the domains of Computer Science and Medicine. Finally, we include a detailed discussion comparing our approach with state-of-the-art models for sentence simplification.
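As an illustration of what conditioning a model on control tokens can look like, the following minimal sketch computes two simple ratios between a complex sentence and its simplification and prepends them to the source sentence as textual tokens. The abstract does not list the linguistic features used or the token format, so the two features, the function names, and the `<NAME_value>` format here are assumptions made for illustration only, not the authors' implementation.

```python
from typing import Dict


def char_length_ratio(source: str, target: str) -> float:
    """Character length of the target divided by that of the source."""
    if not source:
        raise ValueError("source sentence must be non-empty")
    return len(target) / len(source)


def word_count_ratio(source: str, target: str) -> float:
    """Whitespace-separated word count of the target divided by that of the source."""
    source_words = source.split()
    if not source_words:
        raise ValueError("source sentence must contain at least one word")
    return len(target.split()) / len(source_words)


def add_control_tokens(source: str, ratios: Dict[str, float]) -> str:
    """Prepend one hypothetical control token per feature to the source sentence.

    The "<NAME_value>" format is an assumption for this sketch.
    """
    prefix = " ".join(f"<{name}_{value:.2f}>" for name, value in ratios.items())
    return f"{prefix} {source}"


if __name__ == "__main__":
    complex_sentence = "The committee postponed the deliberations indefinitely."
    simple_sentence = "The committee delayed the talks."
    ratios = {
        "CHARS": char_length_ratio(complex_sentence, simple_sentence),
        "WORDS": word_count_ratio(complex_sentence, simple_sentence),
    }
    # The prefixed string is what a sequence-to-sequence model would receive as input.
    print(add_control_tokens(complex_sentence, ratios))
```

In a setup of this kind, the ratios would typically be computed from reference pairs during training and chosen by the user at inference time to steer how strongly the output is simplified.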

Publisher

National Library of Serbia

