Transfer learning with BERT and ClinicalBERT models for multiclass classification of radiology imaging reports

Author:

Mithun Sneha (1), Sherkhane Umesh B. (1), Jha Ashish Kumar (1), Shah Sneha (2), Purandare Nilendu C. (2), Rangarajan V. (2), Dekker A. (1), Bermejo Inigo (1), Wee L. (1)

Affiliation:

1. Maastricht University Medical Centre+

2. Tata Memorial Hospital

Abstract

This study assessed the use of pre-trained language models for classifying radiology reports by cancer type: lung (class 1), esophageal (class 2), and other cancer (class 0). We compared BERT, a general-purpose model, with ClinicalBERT, a model pre-trained on clinical text. The models were trained on radiology reports from our hospital and validated on a hold-out set from the same hospital and on a public dataset (MIMIC-III). We used 4064 hospital radiology reports: 3902 for training (further divided at random into a 70:30 train–test split) and 162 as a hold-out set. A further 542 reports from MIMIC-III were used for independent external validation. The ground-truth labels were generated independently by two expert radiologists. On internal validation, the F1 scores for classes 0, 1, and 2 were 0.62, 0.87, and 0.90 for BERT and 0.93, 0.97, and 0.97 for ClinicalBERT, respectively. On external validation, the F1 scores for classes 0, 1, and 2 were 0.66, 0.37, and 0.46 for BERT and 0.68, 0.50, and 0.64 for ClinicalBERT, respectively. ClinicalBERT outperformed BERT, demonstrating the benefit of domain-specific pre-training for this task. The higher accuracy for lung cancer might be due to class imbalance, as the data contained more lung cancer reports.
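The per-class F1 scores reported above can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the function name `per_class_f1` and the toy labels are hypothetical, and it assumes each class is scored one-vs-rest with F1 = 2TP / (2TP + FP + FN).

```python
from collections import Counter

def per_class_f1(y_true, y_pred, labels=(0, 1, 2)):
    """Return {label: F1}, scoring each class one-vs-rest.

    Hypothetical helper for illustration only; class ids follow the
    abstract (0 = other cancer, 1 = lung, 2 = esophageal).
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # class p was predicted wrongly
            fn[t] += 1          # class t was missed
    scores = {}
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the class never appears
        scores[c] = 2 * tp[c] / denom if denom else 0.0
    return scores

# Toy labels, invented for this example (not data from the study).
y_true = [0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [0, 1, 1, 1, 1, 2, 2, 0]
scores = per_class_f1(y_true, y_pred)
for label, f1 in scores.items():
    print(f"class {label}: F1 = {f1:.2f}")
```

Reporting F1 separately for each class, rather than a single overall accuracy, makes it easier to see when a model does well on a frequent class and poorly on a rare one, which matters when the classes are imbalanced as described above.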

Publisher

Springer Science and Business Media LLC
