Evidence-based potential of generative artificial intelligence large language models in orthodontics: a comparative study of ChatGPT, Google Bard, and Microsoft Bing

Author:

Makrygiannakis Miltiadis A (1, 2), Giannakopoulos Kostis (2), Kaklamanos Eleftherios G (2, 3, 4)

Affiliation:

1. School of Dentistry, National and Kapodistrian University of Athens, Athens 11527, Greece

2. School of Dentistry, European University Cyprus, Nicosia 2404, Cyprus

3. School of Dentistry, Aristotle University of Thessaloniki, Thessaloniki 54124, Greece

4. Hamdan bin Mohammed College of Dental Medicine, Mohammed bin Rashid University of Medicine and Health Sciences (MBRU), Dubai 505055, United Arab Emirates

Abstract

Background: The increasing use of generative artificial intelligence large language models (LLMs) across medical and dental fields, and specifically in orthodontics, raises questions about their accuracy.

Objective: This study aimed to assess and compare the answers given by four LLMs (Google’s Bard, OpenAI’s ChatGPT-3.5 and ChatGPT-4, and Microsoft’s Bing) to clinically relevant questions in orthodontics.

Materials and methods: Ten open-type clinical orthodontics-related questions were posed to the LLMs. The responses were scored from 0 (minimum) to 10 (maximum) points using a predefined rubric, benchmarked against robust scientific evidence, including consensus statements and systematic reviews. Four weeks after the initial evaluation, the answers were re-evaluated to gauge intra-evaluator reliability. The scores were compared using Friedman’s and Wilcoxon’s tests to identify the model providing the most comprehensive, scientifically accurate, clear, and relevant answers.

Results: Overall, no statistically significant differences were detected between the scores given by the two evaluators on either scoring occasion, so an average score was computed for every LLM. The highest-scoring answers were those of Microsoft Bing Chat (average score = 7.1), followed by ChatGPT-4 (average score = 4.7), Google Bard (average score = 4.6), and finally ChatGPT-3.5 (average score = 3.8). Although Microsoft Bing Chat statistically outperformed ChatGPT-3.5 (P-value = 0.017) and Google Bard (P-value = 0.029), and ChatGPT-4 outperformed ChatGPT-3.5 (P-value = 0.011), all models occasionally produced answers lacking comprehensiveness, scientific accuracy, clarity, and relevance.

Limitations: The questions asked were indicative and did not cover the entire field of orthodontics.

Conclusions: Large language models (LLMs) show great potential in supporting evidence-based orthodontics. However, their current limitations carry a risk of incorrect healthcare decisions if they are used without careful consideration. Consequently, these tools cannot substitute for the orthodontist’s essential critical thinking and comprehensive subject knowledge. Effective integration into practice requires further research, clinical validation, and enhancements to the models. Clinicians must be mindful of the limitations of LLMs, as their imprudent use could adversely affect patient care.
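The abstract names Friedman’s test as the omnibus comparison across the four models. As a rough illustration only, the sketch below computes a tie-corrected Friedman chi-square for a table with one row per question and one column per model. The function names and the score table are hypothetical and are not taken from the study. The closed-form P-value applies only to four models (3 degrees of freedom) and relies on the usual chi-square approximation. The pairwise Wilcoxon follow-up tests are not shown. In practice one would more likely call `scipy.stats.friedmanchisquare` and `scipy.stats.wilcoxon`.

```python
import math

def average_ranks(row):
    """Rank one question's scores (1 = lowest); tied scores share their mean rank."""
    order = sorted(range(len(row)), key=lambda i: row[i])
    ranks = [0.0] * len(row)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of scores tied with the one at position i.
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks

def friedman_statistic(scores):
    """Tie-corrected Friedman chi-square; rows are questions, columns are models."""
    n, k = len(scores), len(scores[0])
    rank_sums = [0.0] * k
    tie_term = 0.0
    for row in scores:
        for col, rank in enumerate(average_ranks(row)):
            rank_sums[col] += rank
        for value in set(row):
            t = row.count(value)
            tie_term += t ** 3 - t  # zero for untied values
    q = 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
    correction = 1.0 - tie_term / (n * k * (k * k - 1))
    if correction == 0:
        raise ValueError("all scores are tied within every question")
    return q / correction

def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

# HYPOTHETICAL placeholder scores (0-10), NOT the study's data.
# Columns: ChatGPT-3.5, ChatGPT-4, Google Bard, Microsoft Bing Chat.
hypothetical_scores = [
    [3, 5, 4, 8], [4, 4, 5, 7], [2, 6, 3, 6], [5, 5, 6, 9], [4, 3, 5, 7],
    [3, 5, 4, 6], [5, 6, 5, 8], [2, 4, 3, 7], [4, 5, 6, 5], [6, 4, 5, 8],
]

q = friedman_statistic(hypothetical_scores)
print(f"Friedman chi-square = {q:.2f}, approximate P-value = {chi2_sf_df3(q):.4f}")
```

A small P-value from this omnibus test would only indicate that at least one model’s scores differ. Identifying which pairs differ is the role of the paired follow-up tests mentioned in the abstract.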

Publisher

Oxford University Press (OUP)


Cited by 2 articles.
