Evaluation of Vertigo-Related Information from Artificial Intelligence Chatbot

Published: 2024-09-02

Authors:

Liu Xu¹, Shi Suming¹, Zhang Xin¹, Gao Qianwen¹, Wang Wuqing¹

Affiliation:

1. Fudan University

Abstract

Objective: To compare the diagnostic accuracy of an artificial intelligence (AI) chatbot with that of clinical experts in managing vertigo-related diseases, and to evaluate the chatbot's ability to address vertigo-related questions. Methods: Twenty clinical questions about vertigo were input into ChatGPT-4o, and three otologists rated the responses on a 5-point Likert scale for accuracy, comprehensiveness, clarity, practicality, and credibility. Readability was assessed using the Flesch Reading Ease and Flesch-Kincaid Grade Level formulas. The model and two otologists each diagnosed 15 outpatient vertigo cases, and their diagnostic accuracy was calculated. Statistical analysis used ANOVA and paired t-tests. Results: ChatGPT-4o scored highest in credibility (4.78). Repeated-measures ANOVA showed significant differences across the rated dimensions (F=2.682, p=0.038). Readability analysis showed that diagnostic texts were harder to read. The model's diagnostic accuracy was comparable to that of a clinician with one year of experience but inferior to that of a clinician with five years of experience (p=0.04). Conclusion: ChatGPT-4o shows promise as a supplementary tool for managing vertigo but requires improvements in readability and diagnostic capability.
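
For reference, the two readability measures named in the Methods can be sketched as below. This is a minimal illustration using the standard published formulas, not the authors' actual tooling. The function names, the tokenisation rules, and the vowel-group syllable counter are assumptions made for this sketch; dedicated readability tools count syllables more carefully and may give somewhat different scores.

```python
import re


def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch Reading Ease formula; higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch-Kincaid Grade Level formula; approximates a US school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def count_syllables(word: str) -> int:
    """Hypothetical helper: approximate syllables as runs of vowels (rough heuristic)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def text_counts(text: str) -> tuple[int, int, int]:
    """Return (words, sentences, syllables) using simple regex tokenisation (assumed rules)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return len(words), len(sentences), syllables


if __name__ == "__main__":
    sample = "Vertigo is common. It can be treated."
    w, s, syl = text_counts(sample)
    print(round(flesch_reading_ease(w, s, syl), 2))
    print(round(flesch_kincaid_grade(w, s, syl), 2))
```

Both formulas depend only on average sentence length and average syllables per word, so longer sentences and more polysyllabic terminology lower the Reading Ease score and raise the Grade Level.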

Publisher

Springer Science and Business Media LLC

References (22 articles)

1. Role of Chat GPT in Public Health; Biswas SS; Ann Biomed Eng, May 2023

2. Opportunities, challenges, and future directions of large language models, including ChatGPT in medical education: a systematic scoping review; Xu X; J Educ Eval Health Prof, 2024

3. Performance and Consistency of ChatGPT-4 Versus Otologists: A Clinical Case Series; Lechien JR; Otolaryngol Head Neck Surg, Jun 2024

4. Reliability of large language models in managing odontogenic sinusitis clinical scenarios: a preliminary multidisciplinary evaluation; Saibene AM; Eur Arch Otorhinolaryngol, Apr 2024

5. Reliability of large language models for advanced head and neck malignancies management: a comparison between ChatGPT 4 and Gemini Advanced; Lorenzi A; Eur Arch Otorhinolaryngol, May 2024