Systematic analysis of ChatGPT, Google search and Llama 2 for clinical decision support tasks

Published: 2024-03-06 Issue: 1 Volume: 15
ISSN: 2041-1723
Container-title: Nature Communications
Language: en
Short-container-title: Nat Commun

Author:

Sandmann Sarah, Riepenhausen Sarah, Plagwitz Lucas, Varghese Julian

Abstract

It is likely that individuals are turning to Large Language Models (LLMs) to seek health advice, much like searching for diagnoses on Google. We evaluate clinical accuracy of GPT-3.5 and GPT-4 for suggesting initial diagnosis, examination steps and treatment of 110 medical cases across diverse clinical disciplines. Moreover, two model configurations of the Llama 2 open source LLMs are assessed in a sub-study. For benchmarking the diagnostic task, we conduct a naïve Google search for comparison. Overall, GPT-4 performed best with superior performances over GPT-3.5 considering diagnosis and examination and superior performance over Google for diagnosis. Except for treatment, better performance on frequent vs rare diseases is evident for all three approaches. The sub-study indicates slightly lower performances for Llama models. In conclusion, the commercial LLMs show growing potential for medical question answering in two successive major releases. However, some weaknesses underscore the need for robust and regulated AI models in health care. Open source LLMs can be a viable option to address specific needs regarding data privacy and transparency of training.

Publisher

Springer Science and Business Media LLC

Link

https://www.nature.com/articles/s41467-024-46411-8.pdf

References (27 articles)

1. Varghese, J., Chapiro, J. ChatGPT: The transformative influence of generative AI on science and healthcare. J. Hepatol. 2023 [cited 2023 Sep 7]; Available from: https://www.sciencedirect.com/science/article/pii/S0168827823050390.

2. Deng, J. & Lin, Y. The Benefits and Challenges of ChatGPT: An Overview. Front. Comput. Intell. Syst. 2, 81–83 (2022).

3. Surameery, N.M.S., Shakor, M.Y. Use Chat GPT to Solve Programming Bugs. Int. J. Info. Technol. Comput. Eng. (IJITC) ISSN: 2455–5290. 2023;3(01):17–22.

4. Zheng, H. & Zhan, H. ChatGPT in Scientific Writing: A Cautionary Tale. Am. J. Med. 136, 725–726.e6 (2023).

5. Yang H. How I use ChatGPT responsibly in my teaching. Nature. 2023 [cited 2023 Apr 16]; Available from: https://www.nature.com/articles/d41586-023-01026-9.

Cited by 15 articles.

1. Performance of Open-Source LLMs in Challenging Radiological Cases – A Benchmark Study on 4,049 Eurorad Case Reports; 2024-09-06

2. A future role for health applications of large language models depends on regulators enforcing safety standards; The Lancet Digital Health; 2024-09

3. Large Language Models to Help Appeal Denied Radiotherapy Services; JCO Clinical Cancer Informatics; 2024-09

4. The Combined Use of GIS and Generative Artificial Intelligence in Detecting Potential Geodiversity Sites and Promoting Geoheritage; Resources; 2024-08-27

5. Performance of a Novel Medical Artificial Intelligence Large Model (MedGo) on Supporting Decision-Making for Emergency Patients with Suspected Sepsis (Preprint); 2024-08-15