From technical to understandable: Artificial Intelligence Large Language Models improve the readability of knee radiology reports

Author:

Butler James J. (1), Puleo James (2), Harrington Michael C. (2), Dahmen Jari (3, 4, 5), Rosenbaum Andrew J. (2), Kerkhoffs Gino M. M. J. (3, 4, 5), Kennedy John G. (1)

Affiliation:

1. Department of Orthopaedic Surgery, Foot and Ankle Division, NYU Langone Health, New York City, New York, USA

2. Albany Medical Center, Albany, New York, USA

3. Department of Orthopaedic Surgery and Sports Medicine, Amsterdam Movement Sciences, Amsterdam UMC, University of Amsterdam, Location AMC, Amsterdam, The Netherlands

4. Academic Center for Evidence‐Based Sports Medicine, Amsterdam UMC, Amsterdam, The Netherlands

5. Amsterdam Collaboration for Health and Safety in Sports, International Olympic Committee Research Center, Amsterdam UMC, Amsterdam, The Netherlands

Abstract

Purpose: The purpose of this study was to evaluate the effectiveness of an Artificial Intelligence‐Large Language Model (AI‐LLM) at improving the readability of knee radiology reports.

Methods: Reports of 100 knee X‐rays, 100 knee computed tomography (CT) scans and 100 knee magnetic resonance imaging (MRI) scans were retrieved. The following prompt was entered into the AI‐LLM: ‘Explain this radiology report to a patient in layman's terms in the second person: [Report Text]’. The Flesch–Kincaid reading level (FKRL) score, Flesch reading ease (FRE) score and report length were calculated for the original radiology report and the AI‐LLM‐generated report. Any ‘hallucination’ or inaccurate text in the AI‐LLM‐generated report was documented.

Results: Statistically significant improvements in mean FKRL scores were observed in the AI‐LLM‐generated X‐ray report (from 12.7 ± 1.0 to 7.2 ± 0.6), CT report (from 13.4 ± 1.0 to 7.5 ± 0.5) and MRI report (from 13.5 ± 0.9 to 7.5 ± 0.6). Statistically significant improvements in mean FRE scores were observed in the AI‐LLM‐generated X‐ray report (from 39.5 ± 7.5 to 76.8 ± 5.1), CT report (from 27.3 ± 5.9 to 73.1 ± 5.6) and MRI report (from 26.8 ± 6.4 to 73.4 ± 5.0). Superior FKRL scores and FRE scores were observed in the AI‐LLM‐generated X‐ray report compared to the AI‐LLM‐generated CT report and MRI report (p < 0.001). The hallucination rates in the AI‐LLM‐generated X‐ray report, CT report and MRI report were 2%, 5% and 5%, respectively.

Conclusions: This study highlights the promising use of AI‐LLMs as an innovative, patient‐centred strategy to improve the readability of knee radiology reports. The clinical relevance of this study is that an AI‐LLM‐generated knee radiology report may enhance patients' understanding of their imaging reports, potentially reducing the responder burden placed on the ordering physicians. However, due to the ‘hallucinations’ produced by the AI‐LLM‐generated report, the ordering physician must always engage in a collaborative discussion with the patient regarding both reports and the corresponding images.

Level of Evidence: Level IV.
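
The two readability measures named in the Methods are standard formulas based on average sentence length and average syllables per word. The abstract does not say which tool the authors used to compute them, so the following is only a minimal sketch under stated assumptions: the syllable counter is a crude vowel-group heuristic, the function names `count_syllables` and `readability` are hypothetical, and the two sample sentences are invented for illustration rather than taken from the study.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): count vowel groups, drop a silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def readability(text: str) -> tuple[float, float]:
    """Return (Flesch reading ease, Flesch-Kincaid grade level) for English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    # Standard Flesch reading ease formula: higher means easier to read.
    fre = 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
    # Standard Flesch-Kincaid grade level formula: lower means easier to read.
    fkrl = 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
    return fre, fkrl


# Invented example sentences, not taken from the study's reports.
technical = ("There is moderate tricompartmental osteoarthritis with a complex "
             "degenerative tear of the posterior horn of the medial meniscus.")
plain = "Your knee joint shows wear. The cushion in your knee has a tear."

for label, text in (("technical", technical), ("plain", plain)):
    fre, fkrl = readability(text)
    print(f"{label}: FRE={fre:.1f}, FKRL={fkrl:.1f}")
```

Because the syllable count is approximate, scores from this sketch will differ somewhat from those of dedicated readability software, but the direction of the comparison (higher FRE and lower FKRL for plainer text) should hold.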

Publisher

Wiley

Cited by 1 article.
