Affiliation:
1. Department of Paediatrics, National University of Singapore, Singapore
2. Department of Cardiology, National Heart Centre Singapore, Singapore
3. NUS High School of Mathematics and Science, Singapore
4. Department of Diagnostic Radiology, Singapore General Hospital, Singapore
Abstract
Automated labelling of radiology reports using natural language processing makes it possible to generate ground-truth labels for the large datasets of radiological studies required to train computer vision models. This paper explains the necessary data preprocessing steps, reviews the main methods for automated labelling and compares their performance. There are four main methods of automated labelling: (1) rules-based text-matching algorithms, (2) conventional machine learning models, (3) neural network models and (4) Bidirectional Encoder Representations from Transformers (BERT) models. Rules-based labellers perform a brute-force search against manually curated keywords and can achieve high F1 scores, but they require proper handling of negation. Machine learning models require preprocessing that tokenizes the text and vectorizes it into numerical vectors. Labelling radiology reports is a multilabel classification problem, and conventional models can perform well if given large enough training sets. Deep learning models use connected neural networks, often a long short-term memory network, and likewise perform well when trained on a large dataset. BERT is a transformer-based model built on the attention mechanism. Pretrained BERT models require only fine-tuning with small datasets; in particular, domain-specific BERT models can achieve superior performance compared with the other methods of automated labelling.
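To make the rules-based approach concrete, the sketch below implements a minimal keyword labeller in Python with crude, sentence-level negation handling. The finding keywords, negation cues and negation scope are illustrative assumptions, not the specific rules evaluated in the paper.

```python
import re

# Hypothetical keyword lists for two findings (illustrative only).
FINDING_KEYWORDS = {
    "pneumothorax": ["pneumothorax"],
    "pleural_effusion": ["pleural effusion", "effusion"],
}

# Simple negation cues; a production labeller needs a richer rule set (e.g. NegEx-style rules).
NEGATION_CUES = ["no ", "without ", "no evidence of ", "negative for "]


def label_report(report: str) -> dict:
    """Brute-force keyword search, sentence by sentence.

    A keyword counts as positive unless a negation cue precedes it
    within the same sentence.
    """
    labels = {finding: 0 for finding in FINDING_KEYWORDS}
    for sentence in re.split(r"[.;\n]", report.lower()):
        for finding, keywords in FINDING_KEYWORDS.items():
            for kw in keywords:
                idx = sentence.find(kw)
                if idx == -1:
                    continue
                preceding = sentence[:idx]
                if not any(cue in preceding for cue in NEGATION_CUES):
                    labels[finding] = 1
    return labels


print(label_report("No pneumothorax. Small right pleural effusion is seen."))
# -> {'pneumothorax': 0, 'pleural_effusion': 1}
```

Even this toy example shows why negation handling matters: without the cue check, "No pneumothorax" would be labelled as a positive finding.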
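For the conventional machine-learning route, the following sketch shows one common tokenization, vectorization and multilabel setup using scikit-learn. The toy reports, the label set and the choice of TF-IDF features with one-vs-rest logistic regression are assumptions for illustration, not the specific pipeline benchmarked in the paper.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer

# Toy training data: each report can carry several labels at once (multilabel).
reports = [
    "Large right pleural effusion with adjacent atelectasis.",
    "No pneumothorax or pleural effusion. Clear lungs.",
    "Left apical pneumothorax. No effusion.",
    "Patchy consolidation in the right lower zone, likely pneumonia.",
]
labels = [
    ["pleural_effusion", "atelectasis"],
    [],
    ["pneumothorax"],
    ["pneumonia"],
]

# Binarize the label sets into a label-indicator matrix for multilabel training.
mlb = MultiLabelBinarizer()
y = mlb.fit_transform(labels)

# Tokenization and vectorization (TF-IDF over unigrams and bigrams),
# followed by an independent binary classifier per label.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), lowercase=True),
    OneVsRestClassifier(LogisticRegression(max_iter=1000)),
)
model.fit(reports, y)

# With such a tiny toy corpus the prediction is not meaningful;
# it is shown only to illustrate the multilabel API.
pred = model.predict(["Small residual pleural effusion, no pneumothorax."])
print(mlb.inverse_transform(pred))
```

As the abstract notes, this family of models generally needs a large labelled training set before the per-label classifiers become reliable.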
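For the BERT-based approach, a minimal fine-tuning sketch with the Hugging Face transformers library is shown below. The generic bert-base-uncased checkpoint, the label set and the training arguments are assumptions for illustration; in practice a domain-specific clinical BERT checkpoint would be substituted, in line with the paper's conclusion about domain-specific models.

```python
import torch
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Assumed label set and checkpoint; swap in a domain-specific clinical BERT in practice.
LABELS = ["pneumothorax", "pleural_effusion", "pneumonia"]
CHECKPOINT = "bert-base-uncased"

tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
model = AutoModelForSequenceClassification.from_pretrained(
    CHECKPOINT,
    num_labels=len(LABELS),
    problem_type="multi_label_classification",  # sigmoid output with BCE loss per label
)

# Toy fine-tuning examples; pretrained models typically need only a modest labelled set.
reports = [
    "Left apical pneumothorax. No effusion.",
    "No acute cardiopulmonary abnormality.",
]
targets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # float targets for multilabel BCE loss


class ReportDataset(torch.utils.data.Dataset):
    """Wraps tokenized reports and their multilabel targets for the Trainer."""

    def __init__(self, texts, labels):
        self.enc = tokenizer(texts, truncation=True, padding=True, max_length=256)
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, i):
        item = {k: torch.tensor(v[i]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.labels[i])
        return item


trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="bert_labeller", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=ReportDataset(reports, targets),
)
trainer.train()
```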
Cited by
4 articles.