Race, Sex and Age Disparities in the Performance of ECG Deep Learning Models Predicting Heart Failure

Published: 2023-05-21

Author:

Kaur Dhamanpreet, Hughes John W., Rogers Albert J., Kang Guson, Narayan Sanjiv M., Ashley Euan A., Perez Marco V.

Abstract

Background: Deep learning models may combat widening racial disparities in heart failure outcomes through early identification of individuals at high risk. However, demographic biases in the performance of these models have not been well studied.

Methods: This retrospective analysis used 12-lead ECGs taken between 2008 and 2018 from 290,252 patients referred for standard clinical indications to Stanford Hospital. The primary model was a convolutional neural network trained to predict incident heart failure within 5 years. Biases were evaluated on the testing set (160,312 ECGs) using the area under the receiver operating characteristic curve (AUC), stratified across the protected attributes of race, ethnicity, age, and sex.

Results: A total of 50,956 incident cases of heart failure were observed within 5 years of ECG collection. The performance of the primary model declined with age. No significant differences were observed between racial groups overall. However, the primary model performed significantly worse in Black patients aged 0 to 40 than in all other racial groups in this age group, with differences most pronounced among young Black women. Disparities in model performance did not improve when race, ethnicity, gender, and/or age were integrated into the model architecture, when separate models were trained for each racial group, or when the model was given a dataset with equal racial representation. Using probability thresholds individualized for race, age, and gender offered substantial improvements in F1-scores.

Conclusion: The biases found in this study warrant caution against perpetuating disparities through the development of machine learning tools for the prognosis and management of heart failure. Customizing the application of these models by using probability thresholds individualized by race/ethnicity, age, and sex may offer an avenue to mitigate existing algorithmic disparities.
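The two evaluation ideas named in the abstract, AUC stratified by a protected attribute and probability thresholds chosen per group to maximize F1, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy data, and the rule of picking the F1-maximizing threshold among observed scores are all assumptions made for illustration, and the abstract does not state how the study selected its thresholds.

```python
from collections import defaultdict

def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # AUC is undefined unless both classes are present
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def f1(labels, scores, threshold):
    """F1-score when predicting positive for scores >= threshold."""
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    fn = sum(1 for y, s in zip(labels, scores) if s < threshold and y == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def best_threshold(labels, scores):
    """Observed score that maximizes F1 (the lowest one wins ties)."""
    return max(sorted(set(scores)), key=lambda t: f1(labels, scores, t))

def stratified_report(groups, labels, scores):
    """Per-group AUC, F1-maximizing threshold, and F1 at that threshold."""
    by_group = defaultdict(lambda: ([], []))
    for g, y, s in zip(groups, labels, scores):
        by_group[g][0].append(y)
        by_group[g][1].append(s)
    report = {}
    for g, (ys, ss) in by_group.items():
        t = best_threshold(ys, ss)
        report[g] = {"auc": auc(ys, ss), "threshold": t, "f1": f1(ys, ss, t)}
    return report

# Hypothetical toy data: two groups "A" and "B", not taken from the study.
groups = ["A"] * 4 + ["B"] * 4
labels = [0, 0, 1, 1, 0, 1, 0, 1]
scores = [0.1, 0.2, 0.8, 0.9, 0.3, 0.4, 0.6, 0.7]

report = stratified_report(groups, labels, scores)
for name, stats in sorted(report.items()):
    print(name, stats)
```

In this toy example the two groups end up with different F1-maximizing thresholds, which is the situation in which a single global cutoff would serve one group worse than the other. In practice thresholds would be chosen on a held-out validation set rather than on the data used to report performance.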

Publisher

Cold Spring Harbor Laboratory
