Author:
Nghiem Nhung, Wilson Nick, Krebs Jeremy, Tran Truyen
Abstract
Background: In the age of big data, linked social and administrative health data are increasingly being combined with machine learning (ML) to improve prediction in cardiovascular disease (CVD). We aimed to apply ML methods to extensive national-level health and social administrative datasets to predict future diabetes complications by ethnicity.
Methods: Five ML models were used to predict CVD events among all people with known diabetes in the population of New Zealand, using national-level administrative data at the individual level.
Results: The XGBoost model had the best performance in predicting CVD events three years into the future among the population with diabetes. The optimization procedure found only limited improvement in AUC by ethnicity. The results indicated no trade-off between model predictive performance and the equity gap in prediction by ethnicity. The most important variables differed across models and ethnic groups; examples include age, deprivation, having had a hospitalization event, and the number of years living with diabetes.
Discussion and conclusions: We provide further evidence that ML with administrative health data can be used for meaningful prediction of future health outcomes. As such, it could be used to inform health planning and healthcare resource allocation for diabetes management and the prevention of CVD events. Our results suggest limited scope for developing prediction models by ethnic group, and that the main way to reduce inequitable health outcomes is probably improved delivery of prevention and management to those groups with diabetes at highest need.
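The comparison of AUC by ethnicity described in the Results can be illustrated with a minimal sketch. This is not the authors' code: the abstract does not define the equity gap, so the sketch assumes it is the difference between the highest and lowest per-group AUC. The function names, the group labels "A" and "B", and the toy data are all hypothetical.

```python
from collections import defaultdict


def auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case is
    scored above a randomly chosen negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def auc_by_group(labels, scores, groups):
    """Compute AUC separately within each group (e.g. each ethnic group)."""
    buckets = defaultdict(lambda: ([], []))
    for y, s, g in zip(labels, scores, groups):
        buckets[g][0].append(y)
        buckets[g][1].append(s)
    return {g: auc(ys, ss) for g, (ys, ss) in buckets.items()}


def equity_gap(group_aucs):
    """Assumed definition: best per-group AUC minus worst per-group AUC."""
    return max(group_aucs.values()) - min(group_aucs.values())


# Toy data: 1 = CVD event within the prediction window, 0 = no event.
labels = [1, 0, 1, 0, 1, 0, 1, 0]
scores = [0.9, 0.2, 0.7, 0.4, 0.6, 0.5, 0.3, 0.8]  # hypothetical model outputs
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]  # hypothetical group labels

per_group = auc_by_group(labels, scores, groups)
print(per_group)
print(equity_gap(per_group))
```

In this framing, a model with high overall AUC but a large gap between groups predicts unevenly across the population; the abstract reports that improving overall performance did not come at the cost of a wider gap.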
Publisher
Cold Spring Harbor Laboratory