Author:
St. Pierre Skyler R., Kaczmarski Bartosz, Peirlinck Mathias, Kuhl Ellen
Abstract
The lack of sex-specific cardiovascular disease criteria contributes to the underdiagnosis of women compared to men. For more than half a century, the Framingham Risk Score has been the gold standard for estimating an individual’s risk of developing cardiovascular disease based on age, sex, cholesterol levels, blood pressure, diabetes status, and smoking status. Now, machine learning can offer much more nuanced insight into cardiovascular disease risk. The UK Biobank is a large database that includes traditional risk factors and tests related to the cardiovascular system: magnetic resonance imaging, pulse wave analysis, electrocardiograms, and carotid ultrasounds. Here, we leverage 20,542 datasets from the UK Biobank to build cardiovascular risk models that are more accurate than the Framingham Risk Score and to quantify the underdiagnosis of women compared to men. Strikingly, for first-degree atrioventricular block and dilated cardiomyopathy, two conditions with non-sex-specific diagnostic criteria, our study shows that women are underdiagnosed 2× and 1.4× more than men, respectively. Similarly, our results demonstrate the need for sex-specific criteria in essential primary hypertension and hypertrophic cardiomyopathy. Our feature importance analysis reveals that, of the top 10 features across three sexes and four disease categories, traditional Framingham factors made up 40%–50%; electrocardiogram, 30%–33%; pulse wave analysis, 13%–23%; and magnetic resonance imaging and carotid ultrasound, 0%–10%. Improving on the Framingham Risk Score by leveraging big data and machine learning allows us to incorporate a wider range of biomedical data and prediction features, enhance personalization and accuracy, and continuously integrate new data and knowledge, with the ultimate goal of improving prediction accuracy, early detection, and early intervention in cardiovascular disease management.
Our analysis pipeline and trained classifiers are freely available at https://github.com/LivingMatterLab/CardiovascularDiseaseClassification.