Analysis of Context-Dependent Errors in the Medical Domain in Spanish: A Corpus-Based Study

Author:

Hernández Jésica López (1), Molina Fernando Molina (2), Almela Ángela (1)

Affiliation:

1. Universidad de Murcia, Spain

2. VÓCALI Sistemas Inteligentes, S.L., Murcia, Spain

Abstract

This corpus-based study aimed to investigate the presence of context-dependent linguistic errors in a corpus of clinical reports. The data were taken from a corpus comprising more than 2 million words and made up of clinical reports from emergency medicine, intensive care unit, general surgery, and psychiatry. Quantitative and qualitative analyses were carried out. A language model based on n-grams was developed for the detection of errors, parameters for the selection of cases were defined, and a classification tool was implemented. The findings indicated that emergency medicine was the medical specialty with the highest number of context-dependent errors and that the most frequent type of error was omission of written accent. Furthermore, the analysis revealed the presence of errors of competence due to the incorrect application of the linguistic norm of Spanish, phenomena of phonetic similarity, and composition of words; it is also worth noting that performance errors occurred due to rapid typing on the keyboard. This study constituted the first analysis and creation of a typology of context-dependent errors for the medical domain in Spanish. It contributed to the design of a module based on linguistic knowledge that can be used for the development and improvement of automatic correction systems that, in turn, are used for data processing in medicine.
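The abstract mentions an n-gram language model for error detection but gives no implementation details. The following is a minimal sketch of how such detection could work for the most frequent error type reported (omission of the written accent), not the authors' system. It assumes a bigram model with add-one smoothing, a small hand-written confusion set of accent pairs, a toy training corpus of invented sentences, and an arbitrary decision margin; all names (`CONFUSION`, `TOY_CORPUS`, `train`, `detect`, `margin`) are hypothetical.

```python
from collections import Counter
import math

# Hypothetical confusion set: valid Spanish words that differ only by an accent.
CONFUSION = {
    "esta": "está", "está": "esta",
    "practico": "practicó", "practicó": "practico",
}

# Toy training sentences invented for illustration (not taken from the study's corpus).
TOY_CORPUS = [
    "el paciente está estable",
    "la paciente está consciente y orientada",
    "esta paciente ingresa por dolor abdominal",
    "se practicó una radiografía de tórax",
]

def train(sentences):
    """Count unigrams and bigrams, with sentence boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def log_prob(prev, word, unigrams, bigrams):
    """Add-one smoothed log P(word | prev)."""
    vocab_size = len(unigrams)
    return math.log((bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size))

def context_score(tokens, i, candidate, unigrams, bigrams):
    """Score a candidate at position i using its left and right bigrams."""
    return (log_prob(tokens[i - 1], candidate, unigrams, bigrams)
            + log_prob(candidate, tokens[i + 1], unigrams, bigrams))

def detect(sentence, unigrams, bigrams, margin=0.5):
    """Return (position, observed, suggestion) for words whose confusable
    alternative fits the context better by more than `margin` (assumed value)."""
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    flagged = []
    for i in range(1, len(tokens) - 1):
        alternative = CONFUSION.get(tokens[i])
        if alternative is None:
            continue
        observed = context_score(tokens, i, tokens[i], unigrams, bigrams)
        suggested = context_score(tokens, i, alternative, unigrams, bigrams)
        if suggested - observed > margin:
            flagged.append((i - 1, tokens[i], alternative))
    return flagged

unigrams, bigrams = train(TOY_CORPUS)
flags = detect("el paciente esta estable", unigrams, bigrams)
print(flags)
```

Restricting candidates to a confusion set keeps the check on real-word errors: both "esta" and "está" are valid words, so a dictionary lookup alone would not flag either, and only the surrounding context can separate them. A realistic model would need far more training data and a tuned margin.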

Funder

Ministry of Education of Spain

Spanish National Research Agency

Publisher

SAGE Publications

Subject

General Social Sciences, General Arts and Humanities

