Evaluation and Comparison of Ophthalmic Scientific Abstracts and References by Current Artificial Intelligence Chatbots

Published: 2023-09-01
Issue: 9
Volume: 141
Page: 819
ISSN: 2168-6165
Container-title: JAMA Ophthalmology
Language: en
Short-container-title: JAMA Ophthalmol

Author:

Hua Hong-Uyen¹, Kaakour Abdul-Hadi¹, Rachitskaya Aleksandra¹, Srivastava Sunil¹, Sharma Sumit¹, Mammo Danny A.¹

Affiliation:

1. Cole Eye Institute, Cleveland Clinic Foundation, Cleveland, Ohio

Abstract

Importance: Language-learning model–based artificial intelligence (AI) chatbots are growing in popularity and have significant implications for both patient education and academia. Drawbacks of using AI chatbots in generating scientific abstracts and reference lists, including inaccurate content coming from hallucinations (ie, AI-generated output that deviates from its training data), have not been fully explored.

Objective: To evaluate and compare the quality of ophthalmic scientific abstracts and references generated by earlier and updated versions of a popular AI chatbot.

Design, Setting, and Participants: This cross-sectional comparative study used 2 versions of an AI chatbot to generate scientific abstracts and 10 references for clinical research questions across 7 ophthalmology subspecialties. The abstracts were graded by 2 authors using modified DISCERN criteria and performance evaluation scores.

Main Outcome and Measures: Scores for the chatbot-generated abstracts were compared using the t test. Abstracts were also evaluated by 2 AI output detectors. A hallucination rate for unverifiable references generated by the earlier and updated versions of the chatbot was calculated and compared.

Results: The mean modified AI-DISCERN scores for the chatbot-generated abstracts were 35.9 and 38.1 (maximum of 50) for the earlier and updated versions, respectively (P = .30). Using the 2 AI output detectors, the mean fake scores (with a score of 100% meaning generated by AI) for the earlier and updated chatbot-generated abstracts were 65.4% and 10.8%, respectively (P = .01), for one detector and were 69.5% and 42.7% (P = .17) for the second detector. The mean hallucination rates for nonverifiable references generated by the earlier and updated versions were 33% and 29% (P = .74).

Conclusions and Relevance: Both versions of the chatbot generated average-quality abstracts. There was a high hallucination rate of generating fake references, and caution should be used when using these AI resources for health education or academic purposes.
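
The hallucination rate mentioned in the abstract can be illustrated with a minimal sketch. It assumes the rate is the share of generated references that could not be verified, computed per research question and then averaged; the abstract does not spell out the exact procedure, so this is an assumption. The function name `hallucination_rate` and the example data are hypothetical and are not taken from the study.

```python
from statistics import mean


def hallucination_rate(verified_flags):
    """Return the fraction of references that could not be verified.

    verified_flags: list of booleans, True when a reference was found
    in the literature, False when it could not be verified.
    """
    if not verified_flags:
        raise ValueError("need at least one reference")
    unverifiable = sum(1 for ok in verified_flags if not ok)
    return unverifiable / len(verified_flags)


# Hypothetical placeholder data, NOT the study's results:
# one list per research question, 10 generated references each.
example_questions = [
    [True] * 7 + [False] * 3,  # 3 of 10 references unverifiable
    [True] * 6 + [False] * 4,  # 4 of 10 references unverifiable
]

# Per-question rates, then the mean across questions (assumed procedure).
rates = [hallucination_rate(q) for q in example_questions]
mean_rate = mean(rates)
print(f"mean hallucination rate: {mean_rate:.0%}")
```

Comparing the mean rates of two chatbot versions, as the study does with a t test, would need the per-question rates for each version; that step is left out here because it depends on data the abstract does not report.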

Publisher

American Medical Association (AMA)

Subject

Ophthalmology

Link

https://jamanetwork.com/journals/jamaophthalmology/articlepdf/2807442/jamaophthalmology_hua_2023_oi_230040_1694029135.84387.pdf

Reference: 26 articles.

1. Performance of ChatGPT on USMLE: potential for AI-assisted medical education using large language models;Kung;PLOS Digit Health,2023

2. Comparing scientific abstracts generated by ChatGPT to real abstracts with detectors and blinded human reviewers;Gao;NPJ Digit Med,2023

3. Open artificial intelligence platforms in nursing education: Tools for academic progress or abuse?;O’Connor;Nurse Educ Pract,2023

4. Artificial hallucinations in ChatGPT: implications in scientific writing;Alkaissi;Cureus,2023

5. DISCERN: an instrument for judging the quality of written consumer health information on treatment choices;Charnock;J Epidemiol Community Health,1999

Cited by 23 articles.

1. Evaluation of adherence to STARD for abstracts in a diverse sample of diagnostic accuracy abstracts published in 2012 and 2019 reveals suboptimal reporting practices;Journal of Clinical Epidemiology;2024-09

2. Using Large Language Models to Generate Educational Materials on Childhood Glaucoma;American Journal of Ophthalmology;2024-09

3. Chatbots in neurology and neuroscience: Interactions with students, patients and neurologists;Brain Disorders;2024-09

4. Ethical considerations for large language models in ophthalmology;Current Opinion in Ophthalmology;2024-08-27

5. Large language models: a new frontier in paediatric cataract patient education;British Journal of Ophthalmology;2024-08-22