Performance of ChatGPT in Diagnosis of Corneal Eye Diseases

Published:2024-02-23 Issue:5 Volume:43 Page:664-670
ISSN:0277-3740
Container-title:Cornea
language:en

Author:

Delsoz Mohammad¹, Madadi Yeganeh¹, Raja Hina¹, Munir Wuqaas M.², Tamm Brendan², Mehravaran Shiva³, Soleimani Mohammad⁴⁵, Djalilian Ali⁴, Yousefi Siamak¹⁶

Affiliation:

1. Department of Ophthalmology, Hamilton Eye Institute, University of Tennessee Health Science Center, Memphis, TN;

2. Department of Ophthalmology and Visual Sciences, University of Maryland School of Medicine, Baltimore, MD;

3. Department of Biology, School of Computer, Mathematical, and Natural Sciences, Morgan State University, Baltimore, MD;

4. Department of Ophthalmology and Visual Sciences, University of Illinois at Chicago, Chicago, IL;

5. Eye Research Center, Farabi Eye Hospital, Tehran University of Medical Sciences, Tehran, Iran; and

6. Department of Genetics, Genomics, and Informatics, University of Tennessee Health Science Center, Memphis, TN.

Abstract

Purpose: The aim of this study was to assess the capabilities of ChatGPT-4.0 and ChatGPT-3.5 in diagnosing corneal eye diseases from case reports and to compare them with human experts.

Methods: We randomly selected 20 cases of corneal diseases, including corneal infections, dystrophies, and degenerations, from a publicly accessible online database from the University of Iowa. We then input the text of each case description into ChatGPT-4.0 and ChatGPT-3.5 and asked for a provisional diagnosis. Finally, we evaluated the responses against the correct diagnoses, compared them with the diagnoses made by 3 corneal specialists (human experts), and evaluated interobserver agreement.

Results: The provisional diagnosis accuracy of ChatGPT-4.0 was 85% (17 of 20 cases correct), whereas the accuracy of ChatGPT-3.5 was 60% (12 of 20 cases correct). The accuracies of the 3 corneal specialists, each compared with ChatGPT-4.0 and ChatGPT-3.5 in that order, were 100% (20 cases, P = 0.23, P = 0.0033), 90% (18 cases, P = 0.99, P = 0.6), and 90% (18 cases, P = 0.99, P = 0.6), respectively. The interobserver agreement between ChatGPT-4.0 and ChatGPT-3.5 was 65% (13 cases), whereas the interobserver agreement between ChatGPT-4.0 and the 3 corneal specialists was 85% (17 cases), 80% (16 cases), and 75% (15 cases), respectively. By contrast, the interobserver agreement between ChatGPT-3.5 and each of the 3 corneal specialists was 60% (12 cases).

Conclusions: The accuracy of ChatGPT-4.0 in diagnosing patients with various corneal conditions was markedly better than that of ChatGPT-3.5 and is promising for potential clinical integration. A balanced approach that combines artificial intelligence–generated insights with clinical expertise is key to realizing its full potential in eye care.
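The accuracy figures and P values in the Results can be explored with a short calculation. The sketch below computes accuracy as correct cases over total cases and compares two raters with a two-sided Fisher's exact test. The abstract does not name the statistical test, so the choice of Fisher's exact test is an assumption, and the function names are illustrative only; this is not the authors' analysis code.

```python
from math import comb


def accuracy(correct: int, total: int) -> float:
    """Fraction of cases diagnosed correctly."""
    return correct / total


def fisher_exact_two_sided(correct_a: int, total_a: int,
                           correct_b: int, total_b: int) -> float:
    """Two-sided Fisher's exact test on a 2x2 table of correct/incorrect counts.

    Assumption: the abstract does not state which test produced its P values;
    Fisher's exact test is used here only as a plausible choice for small samples.
    """
    wrong_a = total_a - correct_a
    wrong_total = wrong_a + (total_b - correct_b)
    denom = comb(total_a + total_b, wrong_total)

    def prob(k: int) -> float:
        # Hypergeometric probability that rater A has exactly k incorrect cases.
        return comb(total_a, k) * comb(total_b, wrong_total - k) / denom

    observed = prob(wrong_a)
    lowest = max(0, wrong_total - total_b)
    highest = min(total_a, wrong_total)

    # Sum the probabilities of all tables at least as unlikely as the observed one.
    p_value = 0.0
    for k in range(lowest, highest + 1):
        p = prob(k)
        if p <= observed * (1 + 1e-9):
            p_value += p
    return min(p_value, 1.0)


if __name__ == "__main__":
    print(f"ChatGPT-4.0 accuracy: {accuracy(17, 20):.0%}")
    print(f"ChatGPT-3.5 accuracy: {accuracy(12, 20):.0%}")
    # First specialist (20 of 20 correct) versus each model.
    print(f"P vs ChatGPT-4.0: {fisher_exact_two_sided(20, 20, 17, 20):.4f}")
    print(f"P vs ChatGPT-3.5: {fisher_exact_two_sided(20, 20, 12, 20):.4f}")
```

Under this assumption, comparing the first specialist (20 of 20 correct) with ChatGPT-4.0 (17 of 20) and ChatGPT-3.5 (12 of 20) gives P values of about 0.23 and 0.0033, which agree with the values reported for that specialist. The remaining comparisons may have been computed differently, so the sketch should not be read as a reproduction of the published statistics.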

Publisher

Ovid Technologies (Wolters Kluwer Health)

Reference: 30 articles.

1. Corneal innervation and sensation: the eye and beyond;Yang;Yale J Biol Med,2018

2. Improving access to eye care: a systematic review of the literature;Solomon;Ophthalmology,2022

3. Ophthalmology training and competency levels in care of patients with ophthalmic complaints in United States internal medicine, emergency medicine and family medicine residents;Gelston;J Educ Eval Health Prof,2019

4. Application of artificial intelligence in medicine: an overview;Liu;Curr Med Sci,2021

5. Artificial intelligence for anterior segment diseases: emerging applications in ophthalmology;Ting;Br J Ophthalmol,2021

Cited by 13 articles.

1. Reply to Comment on: Predicting Glaucoma Before Onset Using a Large Language Model Chatbot;American Journal of Ophthalmology;2024-10

2. Predicting Glaucoma Before Onset Using a Large Language Model Chatbot;American Journal of Ophthalmology;2024-10

3. The Diagnostic Ability of GPT-3.5 and GPT-4.0 in Surgery: Comparative Analysis;Journal of Medical Internet Research;2024-09-10

4. Evaluating large language models on medical, lay-language, and self-reported descriptions of genetic conditions;The American Journal of Human Genetics;2024-09

5. Artificial intelligence applications in cataract and refractive surgeries;Current Opinion in Ophthalmology;2024-08-28