ChatGPT and Lacrimal Drainage Disorders: Performance and Scope of Improvement

Author:

Ali Mohammad Javed¹

Affiliation:

1. Govindram Seksaria Institute of Dacryology, L.V. Prasad Eye Institute, Hyderabad, India

Abstract

Purpose: This study aimed to report the performance of the large language model ChatGPT (OpenAI, San Francisco, CA, U.S.A.) in the context of lacrimal drainage disorders. Methods: A set of prompts was constructed from questions and statements spanning common and uncommon aspects of lacrimal drainage disorders. Care was taken to avoid prompts that depended on significant or new knowledge from after 2020. Each prompt was presented to ChatGPT three times. The questions covered common disorders such as primary acquired nasolacrimal duct obstruction and congenital nasolacrimal duct obstruction, along with their causes and management. The prompts also tested ChatGPT on certain specifics, such as the history of dacryocystorhinostomy (DCR) surgery, lacrimal pump anatomy, and human canalicular surfactants. ChatGPT was also quizzed on controversial topics such as silicone intubation and the use of mitomycin C in DCR surgery. The responses were carefully analyzed for evidence-based content, specificity, presence of generic text, disclaimers, factual inaccuracies, and the ability to admit mistakes and challenge incorrect premises. Three lacrimal surgeons graded the responses into three categories: correct, partially correct, and factually incorrect. Results: A total of 21 prompts were presented to ChatGPT. The responses were detailed and followed the structure of the prompt. In response to most questions, ChatGPT gave a generic disclaimer that it could not give medical advice or professional opinion, but then answered the question in detail. Specific prompts such as “how can I perform an external DCR?” were answered with a sequential listing of all the surgical steps. However, several factual inaccuracies were noted across many ChatGPT replies. Several responses on controversial topics such as silicone intubation and mitomycin C were generic and not precisely evidence-based. ChatGPT’s responses to specific questions, such as those on canalicular surfactants and idiopathic canalicular inflammatory disease, were poor. Presenting varied prompts on a single topic led to responses that repeated or recycled phrases. Citations were uniformly missing across all responses. Agreement among the three observers in grading the responses was high (95%). The responses were graded as correct for only 40% of the prompts, partially correct for 35%, and outright factually incorrect for 25%. Hence, counting the partially correct responses, some degree of factual inaccuracy was present in 60% of the responses. One exciting aspect was that ChatGPT was able to admit mistakes and correct them when presented with counterarguments. It was also capable of challenging incorrect prompts and premises. Conclusion: The performance of ChatGPT in the context of lacrimal drainage disorders can, at best, be termed average. However, the potential of this AI chatbot to influence medicine is enormous. There is a need for it to be specifically trained and retrained for individual medical subspecialties.

Publisher

Ovid Technologies (Wolters Kluwer Health)

Subject

Ophthalmology, General Medicine, Surgery
