Abstract

Introduction
During the last few years, we have witnessed a surge in the use of Large Language Models (LLMs) for diverse applications in clinical medicine. Their utility extends to enhancing ECG interpretation, data analysis, and risk prediction in cardiology. This study aims to evaluate the accuracy of LLMs in answering cardiology-specific questions of various difficulty levels.

Methods
This study undertakes a comparative analysis of three state-of-the-art LLMs: Google Bard, GPT-3.5 Turbo, and GPT-4.0, against four distinct sets of clinical scenarios of increasing complexity. These scenarios cover a range of cardiovascular topics, from prevention to the management of acute illnesses and complex pathologies. The responses generated by the LLMs were assessed for accuracy, understanding of medical terminology, clinical relevance, and appropriateness. The evaluations were conducted by a panel of experienced cardiologists.

Results
All models showed an understanding of medical terminology, but the application of this knowledge varied. GPT-4.0 outperformed Google Bard and GPT-3.5 Turbo across the spectrum of cardiology-related clinical scenarios, demonstrating a strong grasp of medical terminology and clinical context, and most proficiently aligning its responses with current guidelines. Limitations were seen in the models' ability to reference ongoing clinical trials.

Conclusion
LLMs showed promising results in their ability to interpret and apply complex clinical guidelines when answering vignette-based clinical queries, with the potential to enhance patient outcomes through personalized advice. However, they should be used with caution, as supplementary tools in clinical cardiology.
Publisher
Cold Spring Harbor Laboratory