A study of generative large language model for medical research and healthcare

Published:2023-11-16 Issue:1 Volume:6 Page:
ISSN:2398-6352
Container-title:npj Digital Medicine
language:en
Short-container-title:npj Digit. Med.

Author:

Peng Cheng, Yang Xi, Chen Aokun, Smith Kaleb E., PourNejatian Nima, Costa Anthony B., Martin Cheryl, Flores Mona G., Zhang Ying, Magoc Tanja, Lipori Gloria, Mitchell Duane A., Ospina Naykky S., Ahmed Mustafa M., Hogan William R., Shenkman Elizabeth A., Guo Yi, Bian Jiang, Wu Yonghui

Abstract

There are enormous enthusiasm and concerns in applying large language models (LLMs) to healthcare. Yet current assumptions are based on general-purpose LLMs such as ChatGPT, which are not developed for medical use. This study develops a generative clinical LLM, GatorTronGPT, using 277 billion words of text including (1) 82 billion words of clinical text from 126 clinical departments and approximately 2 million patients at the University of Florida Health and (2) 195 billion words of diverse general English text. We train GatorTronGPT using a GPT-3 architecture with up to 20 billion parameters and evaluate its utility for biomedical natural language processing (NLP) and healthcare text generation. GatorTronGPT improves biomedical natural language processing. We apply GatorTronGPT to generate 20 billion words of synthetic text. Synthetic NLP models trained using synthetic text generated by GatorTronGPT outperform models trained using real-world clinical text. Physicians’ Turing test using 1 (worst) to 9 (best) scale shows that there are no significant differences in linguistic readability (p = 0.22; 6.57 of GatorTronGPT compared with 6.93 of human) and clinical relevance (p = 0.91; 7.0 of GatorTronGPT compared with 6.97 of human) and that physicians cannot differentiate them (p < 0.001). This study provides insights into the opportunities and challenges of LLMs for medical research and healthcare.

Funder

Patient-Centered Outcomes Research Institute

U.S. Department of Health & Human Services | NIH | National Cancer Institute

Publisher

Springer Science and Business Media LLC

Subject

Health Information Management, Health Informatics, Computer Science Applications, Medicine (miscellaneous)

Link

https://www.nature.com/articles/s41746-023-00958-w.pdf

Reference: 56 articles.

1. Introducing ChatGPT. https://openai.com/blog/chatgpt.

2. Lee, P., Bubeck, S. & Petro, J. Benefits, limits, and risks of GPT-4 as an AI chatbot for medicine. N. Engl. J. Med. 388, 1233–1239 (2023).