Affiliations:
1. Michael G. DeGroote School of Medicine, McMaster University, Hamilton, ON, Canada
2. Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada
3. Department of Diagnostic Imaging, Hamilton Health Sciences, Juravinski Hospital and Cancer Centre, Hamilton, ON, Canada
4. Department of Radiology, McMaster University, Hamilton, ON, Canada
Abstract
Purpose: Bard by Google, a direct competitor to ChatGPT, was recently released. Understanding the relative performance of these chatbots can provide important insight into their strengths and weaknesses, as well as the roles they are best suited to fill. In this project, we aimed to compare the most recent version of ChatGPT, ChatGPT-4, with Bard by Google in their ability to accurately answer radiology board examination practice questions.
Methods: Text-based questions were collected from the 2017-2021 American College of Radiology’s Diagnostic Radiology In-Training (DXIT) examinations. ChatGPT-4 and Bard were queried, and their comparative accuracies, response lengths, and response times were documented. Subspecialty-specific performance was also analyzed.
Results: 318 questions were included in the analysis. ChatGPT answered significantly more accurately than Bard (87.11% vs 70.44%, P < .0001). ChatGPT’s responses were significantly shorter than Bard’s (935.28 ± 440.88 characters vs 1437.52 ± 415.91 characters, P < .0001), and its response time was significantly longer (26.79 ± 3.27 seconds vs 7.55 ± 1.88 seconds, P < .0001). ChatGPT outperformed Bard in neuroradiology (100.00% vs 86.21%, P = .03), general & physics (85.39% vs 68.54%, P < .001), nuclear medicine (80.00% vs 56.67%, P < .01), pediatric radiology (93.75% vs 68.75%, P = .03), and ultrasound (100.00% vs 63.64%, P < .001). In the remaining subspecialties, there were no significant differences between ChatGPT and Bard.
Conclusion: ChatGPT displayed superior radiology knowledge compared with Bard. Both chatbots display reasonable radiology knowledge, but they should be used with awareness of their limitations and fallibility. Both chatbots provided incorrect or illogical answer explanations and did not always address the educational content of the question.
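The headline comparisons can be roughly reproduced from the summary statistics reported in the abstract alone. The sketch below is a minimal Python example, assuming a 2x2 chi-square test for the accuracy comparison and Welch's t-tests for response length and time; the abstract does not state which tests the authors actually used, and the correct-answer counts are back-calculated from the reported percentages and the 318-question total.

from scipy import stats

n = 318
chatgpt_correct = round(0.8711 * n)  # ~277 of 318 questions answered correctly (back-calculated)
bard_correct = round(0.7044 * n)     # ~224 of 318 (back-calculated)

# Accuracy: 2x2 contingency table (correct vs incorrect, by chatbot), chi-square test
table = [[chatgpt_correct, n - chatgpt_correct],
         [bard_correct,    n - bard_correct]]
chi2, p_accuracy, dof, expected = stats.chi2_contingency(table)
print(f"Accuracy: {chatgpt_correct}/{n} vs {bard_correct}/{n}, P = {p_accuracy:.1e}")

# Response length and time: Welch's t-tests from the reported means and standard deviations
t_len, p_len = stats.ttest_ind_from_stats(935.28, 440.88, n,
                                           1437.52, 415.91, n, equal_var=False)
t_time, p_time = stats.ttest_ind_from_stats(26.79, 3.27, n,
                                            7.55, 1.88, n, equal_var=False)
print(f"Response length: P = {p_len:.1e}")
print(f"Response time:   P = {p_time:.1e}")

Running this yields P values well below .0001 for all three comparisons, consistent with the thresholds reported in the abstract.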
Subject
Radiology, Nuclear Medicine and Imaging; General Medicine