BACKGROUND
Electronic health records (EHRs) and routine documentation practices play a vital role in patients' daily care, providing a comprehensive record of their health status, diagnoses, and treatments. However, the complexity and verbosity of EHR narratives can overwhelm healthcare providers and increase the risk of diagnostic inaccuracies.
OBJECTIVE
This study aims to improve the automated diagnosis generation of large language models (LLMs) by augmenting them with a medical knowledge graph, with the goal of minimizing diagnostic errors and preventing patient harm.
METHODS
We introduced a novel approach that integrates a medical knowledge graph (KG) with a graph-based model, Dr.KNOWs, inspired by the clinical diagnostic reasoning process. The KG is derived from the National Library of Medicine's Unified Medical Language System (UMLS), a robust repository of biomedical knowledge. Our method requires no pre-training, leveraging the KG instead as an auxiliary tool for interpreting and summarizing complex medical concepts. We performed an intrinsic evaluation of the model's ability to predict the correct concepts for diagnoses and an extrinsic evaluation of its ability to enhance language models on a diagnosis prediction task. We also conducted a human evaluation in which annotators scored the "Reasoning" sections generated by the language models for explainability.
RESULTS
Our proposed knowledge graph model significantly surpassed a traditional concept extractor in identifying accurate diagnostic concepts, achieving a concept-based F-score of 25.20 (95% CI 23.93-26.98) versus the extractor's 21.13 (95% CI 19.85-22.41). On a diagnosis prediction shared-task dataset, ChatGPT given the predicted paths as input achieved a ROUGE score of 25.43 (95% CI 23.53-25.35), outperforming the no-path version, which scored 21.23 (95% CI 19.58-21.72). The open-box T5 model attained a ROUGE score of 30.72, ranking third on the task's current leaderboard. Human evaluation revealed that outputs conditioned on Dr.KNOWs-predicted paths aligned more closely with human reasoning, a significant improvement over models without paths (P<0.01).
CONCLUSIONS
This study underscores the potential of integrating medical KGs with LLMs to refine AI-driven diagnostic processes, highlighting the importance of external knowledge sources for producing explainable diagnostic pathways and advancing toward AI-enhanced diagnostic decision support systems.