ChatGPT Is Moderately Accurate in Providing a General Overview of Orthopaedic Conditions

Author:

Sparks Chandler A. (1), Fasulo Sydney M. (2), Windsor Jordan T. (1), Bankauskas Vita (1), Contrada Edward V. (1), Kraeutler Matthew J. (3), Scillia Anthony J. (2)

Affiliation:

1. Hackensack Meridian School of Medicine, Nutley, New Jersey

2. Department of Orthopedic Surgery, St. Joseph’s University Medical Center, Paterson, New Jersey

3. Department of Orthopedics, University of Colorado Anschutz Medical Campus, Aurora, Colorado

Abstract

Background: ChatGPT is an artificial intelligence chatbot capable of providing human-like responses for virtually every possible inquiry. This advancement has provoked public interest regarding the use of ChatGPT, including in health care. The purpose of the present study was to investigate the quantity and accuracy of ChatGPT outputs for general patient-focused inquiries regarding 40 orthopaedic conditions.

Methods: For each of the 40 conditions, ChatGPT (GPT-3.5) was prompted with the text “I have been diagnosed with [condition]. Can you tell me more about it?” The numbers of treatment options, risk factors, and symptoms given for each condition were compared with the number in the corresponding American Academy of Orthopaedic Surgeons (AAOS) OrthoInfo website article for information quantity assessment. For accuracy assessment, an attending orthopaedic surgeon ranked the outputs in the categories of <50%, 50% to 74%, 75% to 99%, and 100% accurate. An orthopaedics sports medicine fellow also independently ranked output accuracy.

Results: Compared with the AAOS OrthoInfo website, ChatGPT provided significantly fewer treatment options (mean difference, −2.5; p < 0.001) and risk factors (mean difference, −1.1; p = 0.02) but did not differ in the number of symptoms given (mean difference, −0.5; p = 0.31). The surgical treatment options given by ChatGPT were often nondescript (n = 20 outputs), such as “surgery” as the only operative treatment option. Regarding accuracy, most conditions (26 of 40; 65%) were ranked as mostly (75% to 99%) accurate, with the others (14 of 40; 35%) ranked as moderately (50% to 74%) accurate, by an attending surgeon. Neither surgeon ranked any condition as mostly inaccurate (<50% accurate). Interobserver agreement between accuracy ratings was poor (κ = 0.03; p = 0.30).

Conclusions: ChatGPT provides at least moderately accurate outputs for general inquiries of orthopaedic conditions but is lacking in the quantity of information it provides for risk factors and treatment options. Professional organizations, such as the AAOS, are the preferred source of musculoskeletal information when compared with ChatGPT.

Clinical Relevance: ChatGPT is an emerging technology with potential roles and limitations in patient education that are still being explored.
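
The interobserver agreement statistic reported above (κ) can be illustrated with a minimal sketch of unweighted Cohen's kappa for two raters. The abstract does not state whether a weighted or unweighted kappa was used, so the unweighted form is an assumption here. The function name `cohen_kappa` and the six example ratings are hypothetical and are not data from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(rater_a)

    # Observed agreement: fraction of items given the same category by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of each rater's marginal proportions, summed over categories.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if expected == 1.0:
        # Both raters used one identical category for every item.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical ratings (NOT study data), using the abstract's accuracy categories.
attending = ["75-99", "75-99", "50-74", "75-99", "50-74", "100"]
fellow = ["75-99", "50-74", "50-74", "100", "75-99", "100"]
kappa = cohen_kappa(attending, fellow)
```

Kappa corrects raw agreement for the agreement expected by chance, so a value near 0, as reported in the Results, means the two raters agreed about as often as chance alone would predict, even if their raw agreement looked moderate.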

Publisher

Ovid Technologies (Wolters Kluwer Health)

Cited by 1 article.

1. ChatGPT Can Offer At Least Satisfactory Responses to Common Patient Questions Regarding Hip Arthroscopy. Arthroscopy: The Journal of Arthroscopic & Related Surgery. 2024-09.
