BACKGROUND
Electronic health records (EHRs) and routine documentation practices play a vital role in patients' daily care, providing a comprehensive record of their health status, diagnoses, and treatments. However, the complexity and verbosity of EHR narratives can overwhelm healthcare providers and increase the risk of diagnostic inaccuracies.
OBJECTIVE
This study aims to improve the automated diagnosis generation of large language models (LLMs) by augmenting them with a medical knowledge graph, with the goal of minimizing diagnostic errors and preventing patient harm.
METHODS
We introduced a novel approach that integrates a medical knowledge graph (KG) with a graph-based model, Dr.KNOWs, inspired by the clinical diagnostic reasoning process. The KG is derived from the National Library of Medicine's Unified Medical Language System (UMLS), a robust repository of biomedical knowledge. Our method requires no pre-training, leveraging the KG instead as an auxiliary tool for interpreting and summarizing complex medical concepts. We performed an intrinsic evaluation of the model's ability to predict the correct concepts for diagnoses and an extrinsic evaluation of its ability to enhance language models on a diagnosis prediction task. We also conducted a human evaluation in which annotators scored the "Reasoning" sections generated by the language models for explainability.
RESULTS
Our proposed knowledge graph model significantly surpassed a traditional concept extractor in identifying accurate diagnostic concepts, achieving a concept-based F-score of 25.20 (95% CI 23.93-26.98) versus the extractor's 21.13 (95% CI 19.85-22.41). On a diagnosis prediction shared-task dataset, ChatGPT given the predicted paths as input achieved a ROUGE score of 25.43 (95% CI 23.53-25.35), outperforming the no-path version, which scored 21.23 (95% CI 19.58-21.72). The open-box T5 model attained a ROUGE score of 30.72, ranking third on the task's current leaderboard. Human evaluation revealed that outputs conditioned on Dr.KNOWs-predicted paths aligned more closely with human reasoning, a significant improvement over models without paths (P<0.01).
CONCLUSIONS
This study underscores the potential of integrating medical KGs with LLMs to refine AI-driven diagnostic processes, highlighting the importance of external knowledge sources for producing explainable diagnostic pathways and advancing toward AI-enhanced diagnostic decision support systems.