Affiliations:
1. NYU Langone Health, New York, New York, USA
2. Albany Medical Center, Albany, New York, USA
Abstract
Background: The purpose of this study was to assess the effectiveness of an artificial intelligence large language model (AI-LLM) at improving the readability of hand and wrist radiology reports.

Methods: The radiology reports of 100 hand and/or wrist radiographs, 100 hand and/or wrist computed tomography (CT) scans, and 100 hand and/or wrist magnetic resonance imaging (MRI) scans were extracted. The following prompt was entered into the AI-LLM: “Explain this radiology report to a patient in layman’s terms in the second person: [Report Text].” The report length, Flesch reading ease score (FRES), and Flesch-Kincaid reading level (FKRL) were calculated for each original radiology report and the corresponding AI-LLM–generated report. The accuracy of each AI-LLM–generated report was assessed on a 5-point Likert scale, and any “hallucinations” in the AI-LLM–generated reports were recorded.

Results: There was a statistically significant improvement in mean FRES and FKRL for the AI-LLM–generated radiograph, CT, and MRI reports. For all AI-LLM–generated reports, the mean reading level improved to below the eighth-grade level. The mean Likert scores for the AI-LLM–generated radiograph, CT, and MRI reports were 4.1 ± 0.6, 3.9 ± 0.6, and 3.9 ± 0.7, respectively. The hallucination rates in the AI-LLM–generated radiograph, CT, and MRI reports were 3%, 6%, and 6%, respectively.

Conclusions: This study demonstrates that an AI-LLM can effectively improve the readability of hand and wrist radiology reports, underscoring its potential as a patient-centric strategy for improving patients’ comprehension of their imaging reports.

Level of Evidence: IV.
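For reference, the two readability metrics named in the Methods are not defined in the abstract; the standard published formulas are given below. How word, sentence, and syllable counts were tokenized, and the specific software used to compute them in this study, are not stated here.

\[
\mathrm{FRES} = 206.835 - 1.015\left(\frac{\text{total words}}{\text{total sentences}}\right) - 84.6\left(\frac{\text{total syllables}}{\text{total words}}\right)
\]

\[
\mathrm{FKRL} = 0.39\left(\frac{\text{total words}}{\text{total sentences}}\right) + 11.8\left(\frac{\text{total syllables}}{\text{total words}}\right) - 15.59
\]

Higher FRES values indicate easier text, whereas FKRL maps directly to a U.S. school grade level, which is why a value below 8 corresponds to the “below an eighth-grade reading level” threshold reported in the Results.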