Transformer versus traditional natural language processing: how much data is enough for automated radiology report classification?

Published:2023-09 Issue:1149 Volume:96 Page:
ISSN:0007-1285
Container-title:The British Journal of Radiology
language:en
Short-container-title:BJR

Author:

Yang Eric¹,², Li Matthew D³, Raghavan Shruti², Deng Francis², Lang Min², Succi Marc D², Huang Ambrose J², Kalpathy-Cramer Jayashree⁴

Affiliation:

1. Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA

2. Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA

3. Department of Radiology and Diagnostic Imaging, University of Alberta, Edmonton, Alberta, Canada

4. Department of Ophthalmology, University of Colorado, Aurora, CO, USA

Abstract

Objectives: Current state-of-the-art natural language processing (NLP) techniques use transformer deep-learning architectures, which depend on large training datasets. We hypothesized that traditional NLP techniques may outperform transformers for smaller radiology report datasets.

Methods: We compared the performance of BioBERT, a deep-learning-based transformer model pre-trained on biomedical text, and three traditional machine-learning models (gradient boosted tree, random forest, and logistic regression) on seven classification tasks given free-text radiology reports. Tasks included detection of appendicitis, diverticulitis, bowel obstruction, and enteritis/colitis on abdomen/pelvis CT reports, ischemic infarct on brain CT/MRI reports, and medial and lateral meniscus tears on knee MRI reports (7,204 total annotated reports). The performance of NLP models on held-out test sets was compared after training using the full training set, and 2.5%, 10%, 25%, 50%, and 75% random subsets of the training data.

Results: In all tested classification tasks, BioBERT performed poorly at smaller training sample sizes compared to non-deep-learning NLP models. Specifically, BioBERT required training on approximately 1,000 reports to perform similarly or better than non-deep-learning models. At around 1,250 to 1,500 training samples, the testing performance for all models began to plateau, where additional training data yielded minimal performance gain.

Conclusions: With larger sample sizes, transformer NLP models achieved superior performance in radiology report binary classification tasks. However, with smaller sizes (<1000) and more imbalanced training data, traditional NLP techniques performed better.

Advances in knowledge: Our benchmarks can help guide clinical NLP researchers in selecting machine-learning models according to their dataset characteristics.

Publisher

Oxford University Press (OUP)

Subject

Radiology, Nuclear Medicine and Imaging, General Medicine

Link

https://www.birpublications.org/doi/pdf/10.1259/bjr.20220769

Reference: 23 articles.

1. Natural Language Processing in Radiology: A Systematic Review

2. Natural Language–based Machine Learning Models for the Annotation of Clinical Radiology Reports

3. Assessment of Deep Natural Language Processing in Ascertaining Oncologic Outcomes From Radiology Reports

4. Comparative effectiveness of convolutional neural network (CNN) and recurrent neural network (RNN) architectures for radiology text report classification

5. Automatic Disease Annotation From Radiology Reports Using Artificial Intelligence Implemented by a Recurrent Neural Network

Cited by 9 articles.

1. The More, the Better? Modalities of Metastatic Status Extraction on Free Medical Reports Based on Natural Language Processing; JCO Clinical Cancer Informatics; 2024-08

2. Artificial intelligence in ischemic stroke images: current applications and future directions; Frontiers in Neurology; 2024-07-10

3. Evaluation of a BERT Natural Language Processing Model for Automating CT and MRI Triage and Protocol Selection; Canadian Association of Radiologists Journal; 2024-06-04

4. Probing the limit of hydrologic predictability with the Transformer network; Journal of Hydrology; 2024-06

5. Artificial Intelligence–Assisted Cancer Status Detection in Radiology Reports; Cancer Research Communications; 2024-04-09