Abstract
Multimorbidity, the coexistence of multiple health conditions in an individual, is prevalent and increasing worldwide, and is a growing challenge for patients and healthcare systems. Multimorbidity also contributes to an increased risk of hospital admission or even death. In this study, we use longitudinal data routinely collected in electronic health records, linked to half a million people from the UK Biobank, to generate digital comorbidity fingerprints (DCFs) using a topic modelling approach, Latent Dirichlet Allocation. These comorbidity fingerprints summarise a patient’s full secondary care clinical history, i.e. their comorbidities and past interventions. We identified 18 clinically relevant DCFs, which captured nuanced combinations of diseases and risk factors, e.g. grouping cardiovascular disorders with common risk factors, but also novel groupings that are not obvious and differ in both breadth and depth from existing observational disease associations. The DCFs, combined with demographic characteristics, performed on par with or better than traditional models of all-cause mortality or hospital admission, showcasing the potential of data-driven strategies in healthcare forecasting. The comorbidity fingerprints, together with age and number of hospital admissions, were the most important factors in the predictions. Additionally, our DCF approach showed robust and consistent performance over time.
Our findings underscore the promising role of interpretable data-driven approaches in healthcare forecasting, suggesting improved risk profiling for individual clinical decisions and targeted public health interventions, with consistent and robust performance over time.
Author summary
This study addresses the global challenge of multimorbidity, the presence of multiple health conditions in individuals, which is on the rise and poses a significant burden on patients and healthcare systems. Investigating its impact on the risk of hospitalization or mortality, we employ a sophisticated approach using longitudinal data from the UK Biobank to create digital comorbidity fingerprints (DCFs) through natural language processing methods. These DCFs, summarizing a patient’s complete clinical history, reveal 18 clinically relevant patterns, including unique combinations of diseases and risk factors. When combined with patient demographic and lifestyle data, the DCF approach performs similarly to traditional models in predicting all-cause mortality or hospitalization. Notably, the DCF approach demonstrates robust and consistent performance over time, highlighting its potential for enhancing healthcare forecasting. These findings emphasize the value of interpretable data-driven strategies in healthcare, offering improved risk profiling for individual clinical decisions and targeted public health interventions with enduring reliability.
Publisher
Cold Spring Harbor Laboratory