Diagnostic Accuracy of a Custom Large Language Model on Rare Pediatric Disease Case Reports

Published: 2024-09-13
ISSN: 1552-4825
Container-title: American Journal of Medical Genetics Part A
Language: en
Short-container-title: American J of Med Genetics Pt A

Author:

Young Cameron C.¹², Enichen Ellie¹², Rivera Christian¹², Auger Corinne A.¹², Grant Nathan¹², Rao Arya¹², Succi Marc D.²³

Affiliation:

1. Harvard Medical School, Boston, Massachusetts, USA

2. Medically Engineered Solutions in Healthcare Incubator, Innovation in Operations Research Center, Mass General Brigham, Boston, Massachusetts, USA

3. Department of Radiology, Massachusetts General Hospital, Boston, Massachusetts, USA

Abstract

Accurately diagnosing rare pediatric diseases is frequently a clinical challenge because of their complex and unusual clinical presentations. Here, we explore the capabilities of three large language models (LLMs), GPT‐4, Gemini Pro, and a custom‐built LLM (GPT‐4 integrated with the Human Phenotype Ontology [GPT‐4 HPO]), by evaluating their diagnostic performance on 61 rare pediatric disease case reports. The performance of the LLMs was assessed for accuracy in identifying the specific diagnosis, listing the correct diagnosis in a differential list, and identifying the broad disease category. In addition, GPT‐4 HPO was tested on 100 general pediatrics case reports previously assessed on other LLMs to further validate its performance. The results indicated that GPT‐4 predicted the correct diagnosis with a diagnostic accuracy of 13.1%, whereas both GPT‐4 HPO and Gemini Pro had diagnostic accuracies of 8.2%. Further, GPT‐4 HPO showed improved performance compared with the other two LLMs in listing the correct diagnosis in its differential list and in identifying the broad disease category. Although these findings underscore the potential of LLMs for diagnostic support, particularly when enhanced with domain‐specific ontologies, they also stress the need for further improvement prior to integration into clinical practice.
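As a rough check on the figures above, the reported accuracies can be mapped back to case counts. The sketch below assumes each percentage is the number of correct specific diagnoses divided by all 61 rare pediatric disease case reports, reported to one decimal place. The abstract does not state the counts, so the helper `correct_cases` and the inferred numbers are illustrative only.

```python
def correct_cases(accuracy_percent: float, total_cases: int) -> int:
    """Return the whole number of correct cases whose rate rounds to accuracy_percent."""
    # Assumption: accuracy = correct / total, reported to one decimal place.
    estimate = round(accuracy_percent / 100 * total_cases)
    if round(100 * estimate / total_cases, 1) != accuracy_percent:
        raise ValueError("no whole-number count matches the reported percentage")
    return estimate


TOTAL_CASES = 61  # rare pediatric disease case reports, from the abstract

# Specific-diagnosis accuracies as reported in the abstract (percent).
reported = {"GPT-4": 13.1, "GPT-4 HPO": 8.2, "Gemini Pro": 8.2}

# Inferred counts; these are not stated in the source.
inferred = {model: correct_cases(pct, TOTAL_CASES) for model, pct in reported.items()}

for model, count in inferred.items():
    print(f"{model}: about {count} of {TOTAL_CASES} cases")
```

Under that assumption, 13.1% corresponds to 8 of 61 cases for GPT-4, and 8.2% corresponds to 5 of 61 cases for each of GPT-4 HPO and Gemini Pro.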

Publisher

Wiley

Link

https://onlinelibrary.wiley.com/doi/pdf/10.1002/ajmg.a.63878

References (24 articles)

1. Genetic counselors' utilization of ChatGPT in professional practice: A cross‐sectional study

2. Diagnostic Accuracy of a Large Language Model in Pediatric Case Studies

3. Diagnostic Process in Rare Diseases: Determinants Associated with Diagnostic Delay

4. Cao, L., J. Sun, and A. Cross. 2024. “AutoRD: An Automatic and End‐To‐End System for Rare Disease Knowledge Graph Construction Based on Ontologies‐Enhanced Large Language Models.” arXiv [cs.CL]. https://arxiv.org/abs/2403.00953.

5. The future landscape of large language models in medicine