Affiliation:
1. Department of Software Engineering, University of Szeged, Szeged, Hungary
2. HUN-REN-SZTE Research Group on Artificial Intelligence, Szeged, Hungary
Abstract
Radiologic reports often contain misspellings that compromise report quality and pose challenges for machine understanding methods, which require syntactically correct input. General-purpose automatic spelling correction tools are less effective on specialized documents such as spinal radiologic reports, particularly in morphologically rich languages like Hungarian. The difficulties stem from complex conjugations and from Latin terms being modified according to the rules of the native language. This study introduces a method for automatically correcting these misspellings using the Hunspell software and field-specific dictionaries. The approach, enhanced by linguistic analysis and statistical data, improves information retrieval, as demonstrated in machine-learning-based classification and rule-based identification tasks. Notably, our method identified over 30% more valid errors than human annotators, highlighting its efficiency. We offer a primarily dictionary-based solution for correcting highly specialized texts and explore the impact of nonword correction on machine understanding. This work underscores the importance of tailored spelling correction for improving the accuracy of text processing algorithms.
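The general idea of dictionary-based nonword correction with frequency-informed ranking can be illustrated with a minimal sketch. This is not the authors' pipeline: it swaps Hunspell for Python's standard-library `difflib`, and the lexicon `DOMAIN_FREQ`, its terms, its counts, and the function names are hypothetical toy values chosen only for illustration.

```python
import difflib
from collections import Counter

# Hypothetical field-specific lexicon with made-up corpus frequencies.
# A real system would load a curated domain dictionary instead.
DOMAIN_FREQ = Counter({
    "discus": 300,
    "hernia": 210,
    "protrusio": 120,
    "spondylosis": 95,
    "stenosis": 80,
})


def correct_token(token, lexicon=DOMAIN_FREQ, cutoff=0.8):
    """Return the token unchanged if it is in the lexicon; otherwise
    return the closest lexicon entry, or the token itself if none is close."""
    lower = token.lower()
    if lower in lexicon:
        return token
    # Candidate generation: lexicon entries similar to the nonword.
    candidates = difflib.get_close_matches(lower, list(lexicon), n=5, cutoff=cutoff)
    if not candidates:
        return token
    # Ranking: string similarity first, corpus frequency as a tie-breaker.
    return max(
        candidates,
        key=lambda w: (difflib.SequenceMatcher(None, lower, w).ratio(), lexicon[w]),
    )


def correct_text(text, lexicon=DOMAIN_FREQ):
    """Correct each whitespace-separated token independently."""
    return " ".join(correct_token(t, lexicon) for t in text.split())


if __name__ == "__main__":
    print(correct_text("discus hernia stenosys"))
```

A morphologically rich language such as Hungarian would additionally need affix-aware matching, which is what Hunspell's affix rules provide and what this sketch deliberately leaves out.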
Funder
Mesterséges Intelligencia Nemzeti Laboratórium
Innovációs és Technológiai Minisztérium