ChatGPT fails challenging the recent ESCMID brain abscess guideline

Published: 2024-01-27 Issue: 4 Volume: 271 Page: 2086-2101
ISSN: 0340-5354
Container-title: Journal of Neurology
Language: en
Short-container-title: J Neurol

Authors:

Dyckhoff-Shen Susanne, Koedel Uwe, Brouwer Matthijs C., Bodilsen Jacob, Klein Matthias

Abstract

Background: With artificial intelligence (AI) on the rise, it remains unclear whether AI can professionally evaluate medical research and give scientifically valid recommendations.

Aim: This study aimed to assess the accuracy of ChatGPT’s responses to ten key questions on brain abscess diagnostics and treatment in comparison with the guideline recently published by the European Society of Clinical Microbiology and Infectious Diseases (ESCMID).

Methods: All ten PECO (Population, Exposure, Comparator, Outcome) questions that had been developed during the guideline process were presented directly to ChatGPT. Next, ChatGPT was additionally fed with data from the studies selected for each PECO question by the ESCMID committee. The AI’s responses were subsequently compared with the recommendations of the ESCMID guideline.

Results: For 17 out of 20 challenges, ChatGPT was able to give recommendations on the management of patients with brain abscess, including grade of evidence and strength of recommendation. Without data prompting, 70% of questions were answered very similarly to the guideline recommendation. In the answers that differed from the guideline recommendations, no patient hazard was present. Data input slightly improved the clarity of ChatGPT’s recommendations but led to fewer correct answers, including two recommendations that directly contradicted the guideline and were associated with a possible hazard to the patient.

Conclusion: ChatGPT seems to be able to rapidly gather information on brain abscesses and, in most cases, give recommendations on key questions about their management. Nevertheless, single responses could possibly harm patients. Thus, the expertise of an expert committee remains indispensable.

Funder

Universitätsklinik München

Publisher

Springer Science and Business Media LLC

Link

https://link.springer.com/content/pdf/10.1007/s00415-023-12168-1.pdf

References (15 articles)

1. Bodilsen J, Duerlund LS, Mariager T, Brandt CT, Petersen PT, Larsen L, Hansen BR, Omland LH, Tetens MM, Wiese L et al (2023) Clinical features and prognostic factors in adults with brain abscess. Brain 146(4):1637–1647

2. Bodilsen J, Dalager-Pedersen M, van de Beek D, Brouwer MC, Nielsen H (2020) Incidence and mortality of brain abscess in Denmark: a nationwide population-based study. Clin Microbiol Infect 26(1):95–100

3. Bodilsen J, D’Alessandris QG, Humphreys H, Iro MA, Klein M, Last K, Montesinos IL, Pagliano P, Sipahi OR, San-Juan R et al (2023) European Society of Clinical Microbiology and Infectious Diseases guidelines on diagnosis and treatment of brain abscess in children and adults. Clin Microbiol Infect. https://doi.org/10.1016/j.cmi.2023.10.012

4. Holzinger A, Keiblinger K, Holub P, Zatloukal K, Muller H (2023) AI for life: Trends in artificial intelligence for biotechnology. N Biotechnol 74:16–24

5. Cakir H, Caglar U, Yildiz O, Meric A, Ayranci A, Ozgor F (2023) Evaluating the performance of ChatGPT in answering questions related to urolithiasis. Int Urol Nephrol. https://doi.org/10.1016/j.jpurol.2023.08.003

Cited by 2 articles.

1. Besteht ChatGPT die neurologische Facharztprüfung? Eine kritische Betrachtung [Does ChatGPT pass the neurology specialist examination? A critical appraisal]; psychopraxis. neuropraxis; 2024-08-01

2. Protocol For Human Evaluation of Artificial Intelligence Chatbots in Clinical Consultations;2024-03-02