The scientific knowledge of three large language models in cardiology: multiple-choice questions examination-based performance

Author:

Altamimi Ibraheem (1, 2), Alhumimidi Abdullah (1), Alshehri Salem (1), Alrumayan Abdullah (3), Al-khlaiwi Thamir (4), Meo Sultan A. (4), Temsah Mohamad-Hani (1, 2, 5)

Affiliation:

1. College of Medicine

2. Evidence-Based Health Care and Knowledge Translation Research Chair, Family and Community Medicine Department, College of Medicine, King Saud University

3. College of Medicine, King Saud Bin Abdulaziz University for Health and Sciences, Riyadh, Saudi Arabia

4. Department of Physiology

5. Pediatric Intensive Care Unit, Pediatric Department, College of Medicine, King Saud University Medical City

Abstract

Background: The integration of artificial intelligence (AI) chatbots like Google’s Bard, OpenAI’s ChatGPT, and Microsoft’s Bing Chatbot into academic and professional domains, including cardiology, has been rapidly evolving. Their application in educational and research frameworks, however, raises questions about their efficacy, particularly in specialized fields like cardiology. This study aims to evaluate the knowledge depth and accuracy of these AI chatbots in cardiology using a multiple-choice question (MCQ) format.

Methods: This exploratory, cross-sectional study was conducted in November 2023 using a bank of 100 MCQs covering various cardiology topics, created from authoritative textbooks and question banks. These MCQs were used to assess the knowledge level of Google’s Bard, Microsoft Bing, and ChatGPT 4.0. Each question was entered manually into the chatbots to avoid memory retention bias.

Results: ChatGPT 4.0 demonstrated the highest knowledge score in cardiology, with 87% accuracy, followed by Bing at 60% and Bard at 46%. Performance varied across cardiology subtopics, with ChatGPT consistently outperforming the others. Notably, the study revealed significant differences in the proficiency of these chatbots in specific cardiology domains.

Conclusion: This study highlights a spectrum of efficacy among AI chatbots in disseminating cardiology knowledge. ChatGPT 4.0 emerged as a potential auxiliary educational resource in cardiology, surpassing traditional learning methods in some aspects. However, the variability in performance among these AI systems underscores the need for cautious evaluation and continuous improvement, especially for chatbots like Bard, to ensure reliability and accuracy in medical knowledge dissemination.
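To give a rough sense of the sampling uncertainty behind the reported accuracies, the sketch below converts each score on the 100-question bank into a 95% Wilson score interval. This is an illustration only: the abstract does not say which statistical test the authors used, and the interval method, the `wilson_interval` helper, and the `REPORTED_CORRECT` table are assumptions introduced here. The counts follow directly from the reported percentages (87%, 60%, and 46% of 100 questions).

```python
import math

# Correct answers out of 100 MCQs, derived from the reported accuracies.
# The dictionary name and layout are illustrative, not from the paper.
REPORTED_CORRECT = {"ChatGPT 4.0": 87, "Bing": 60, "Bard": 46}
TOTAL_QUESTIONS = 100


def wilson_interval(correct, total, z=1.96):
    """Return the (lower, upper) Wilson score interval for a proportion.

    z=1.96 corresponds to an approximate 95% confidence level.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = correct / total
    denom = 1 + z ** 2 / total
    center = (p + z ** 2 / (2 * total)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    )
    return center - half_width, center + half_width


for name, correct in REPORTED_CORRECT.items():
    low, high = wilson_interval(correct, TOTAL_QUESTIONS)
    print(f"{name}: {correct}/{TOTAL_QUESTIONS} correct, "
          f"95% interval {low:.3f} to {high:.3f}")
```

With only 100 questions per chatbot, intervals of this kind are fairly wide, which is one reason to read the subtopic-level comparisons with caution.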

Publisher

Ovid Technologies (Wolters Kluwer Health)

