Inferring cancer disease response from radiology reports using large language models with data augmentation and prompting

Published:2023-07-14 Issue:10 Volume:30 Page:1657-1664
ISSN:1067-5027
Container-title:Journal of the American Medical Informatics Association
language:en

Author:

Tan Ryan Shea Ying Cong¹², Lin Qian³, Low Guat Hwa¹, Lin Ruixi³, Goh Tzer Chew⁴, Chang Christopher Chu En⁴, Lee Fung Fung⁴, Chan Wei Yin⁴, Tan Wei Chong¹², Tey Han Jieh¹, Leong Fun Loon¹, Tan Hong Qi⁵, Nei Wen Long⁵, Chay Wen Yee¹², Tai David Wai Meng¹², Lai Gillianne Geet Yi¹², Cheng Lionel Tim-Ee²⁶, Wong Fuh Yong⁵, Chua Matthew Chin Heng⁷, Chua Melvin Lee Kiang²⁵⁸, Tan Daniel Shao Weng¹⁹, Thng Choon Hua²¹⁰, Tan Iain Bee Huat¹²⁸, Ng Hwee Tou³

Affiliation:

1. Division of Medical Oncology, National Cancer Centre Singapore, Singapore

2. Duke-NUS Medical School, Singapore

3. Department of Computer Science, National University of Singapore, Singapore

4. Institute of Systems Science, National University of Singapore, Singapore

5. Division of Radiation Oncology, National Cancer Centre Singapore, Singapore

6. Department of Diagnostic Radiology, Singapore General Hospital, Singapore

7. Yong Loo Lin School of Medicine, National University of Singapore, Singapore

8. Data and Computational Science Core, National Cancer Centre Singapore, Singapore

9. Division of Clinical Trials and Epidemiological Sciences, National Cancer Centre Singapore, Singapore

10. Division of Oncologic Imaging, National Cancer Centre Singapore, Singapore

Abstract

Objective: To assess large language models on their ability to accurately infer cancer disease response from free-text radiology reports.

Materials and Methods: We assembled 10 602 computed tomography reports from cancer patients seen at a single institution. All reports were classified into: no evidence of disease, partial response, stable disease, or progressive disease. We applied transformer models, a bidirectional long short-term memory model, a convolutional neural network model, and conventional machine learning methods to this task. Data augmentation using sentence permutation with consistency loss as well as prompt-based fine-tuning were used on the best-performing models. Models were validated on a hold-out test set and an external validation set based on Response Evaluation Criteria in Solid Tumors (RECIST) classifications.

Results: The best-performing model was the GatorTron transformer, which achieved an accuracy of 0.8916 on the test set and 0.8919 on the RECIST validation set. Data augmentation further improved the accuracy to 0.8976. Prompt-based fine-tuning did not further improve accuracy but was able to reduce the number of training reports to 500 while still achieving good performance.

Discussion: These models could be used by researchers to derive progression-free survival in large datasets. They may also serve as a decision support tool by providing clinicians an automated second opinion of disease response.

Conclusions: Large clinical language models demonstrate potential to infer cancer disease response from radiology reports at scale. Data augmentation techniques are useful to further improve performance. Prompt-based fine-tuning can significantly reduce the size of the training dataset.
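To illustrate the augmentation step named in the abstract, the following is a minimal Python sketch of sentence permutation paired with a consistency penalty. It is not the authors' implementation. The abstract does not say how sentences are split or how the consistency loss is computed, so the naive regex splitter and the symmetric KL divergence used here are assumptions, and the function names `split_sentences`, `permute_sentences`, and `symmetric_kl` are hypothetical.

```python
import math
import random
import re


def split_sentences(report: str) -> list[str]:
    """Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", report) if s.strip()]


def permute_sentences(report: str, rng: random.Random) -> str:
    """Return a copy of the report with its sentences in a shuffled order."""
    sentences = split_sentences(report)
    rng.shuffle(sentences)
    return " ".join(sentences)


def symmetric_kl(p: list[float], q: list[float], eps: float = 1e-12) -> float:
    """Symmetric KL divergence between two class-probability vectors.

    Used here as an assumed consistency penalty: it is zero when the model
    predicts the same distribution for the original and the permuted report.
    """
    def kl(a: list[float], b: list[float]) -> float:
        return sum(x * math.log((x + eps) / (y + eps)) for x, y in zip(a, b))

    return 0.5 * (kl(p, q) + kl(q, p))


if __name__ == "__main__":
    rng = random.Random(0)
    report = "Liver lesion has decreased in size. No new lesions are seen. Lungs are clear."
    print(permute_sentences(report, rng))
    # Hypothetical model outputs over the four response classes.
    original_probs = [0.05, 0.80, 0.10, 0.05]
    permuted_probs = [0.05, 0.70, 0.20, 0.05]
    print(symmetric_kl(original_probs, permuted_probs))
```

In a training loop, a penalty of this kind would typically be added to the usual classification loss so that the model is discouraged from changing its prediction when only the sentence order changes.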

Funder

A*STAR

Singapore Health Services under the Singhealth Duke-NUS Oncology ACP Programme

Publisher

Oxford University Press (OUP)

Subject

Health Informatics

Link

https://academic.oup.com/jamia/article-pdf/30/10/1657/51770129/ocad133.pdf
