Affiliations:
1. Vital Software, Inc., Auckland, New Zealand
2. Department of Emergency Medicine, Indiana University School of Medicine, Indianapolis, IN, USA
Abstract
Complex medical terminology used in clinical documentation can hinder patients' understanding of their medical findings. We aimed to generate easy-to-understand summaries of clinical radiology reports using large language models (LLMs) and to evaluate their safety and quality. Eight board-certified physician reviewers evaluated 1982 LLM-generated radiology report summaries (computed tomography, magnetic resonance imaging, ultrasound, and x-ray) for safety and quality, using predefined rating criteria and the corresponding original radiology reports for reference. Physician reviewers determined 99.2% (1967 of 1982) of the LLM-generated summaries to be safe. Reviewers rated summary quality on a 5-point scale from 5 (very good) to 1 (very poor); the proportions of summaries receiving each rating, from 5 to 1, were 80.6%, 11.1%, 5.7%, 1.7%, and 0.9%, respectively. Safety varied significantly across imaging modalities (P = .002). Large language models can generate safe, high-quality summaries of clinical radiology reports. Further investigation is warranted to determine the impact of LLM-generated summaries on patients' perceived understanding, knowledge of their medical conditions, and overall experience.