Preeclampsia Predictor with Machine Learning: A Comprehensive and Bias-Free Machine Learning Pipeline

Authors:

Lin Yun C., Mallia Daniel, Clark-Sevilla Andrea O., Catto Adam, Leshchenko Alisa, Haas David M., Wapner Ronald, Pe’er Itsik, Raja Anita, Salleb-Aouissi Ansaf

Abstract

Preeclampsia is a type of hypertension that develops during pregnancy. It is one of the leading causes of maternal morbidity, with consequences during and after pregnancy. Because of its diverse clinical presentation, preeclampsia is a uniquely challenging adverse pregnancy outcome to predict and manage. In this paper, we explore preeclampsia in a nulliparous study cohort with machine learning techniques to build a model that distinguishes between participants most at risk for morbidity, those with preeclampsia with severe features or eclampsia, and the class of no pregnancy-related hypertension. We curated the dataset for this secondary analysis using only training examples that have all known biomarkers, factors, and placental analytes. We built classification models at discrete time points in pregnancy that combine risk factors for preeclampsia with severe features or eclampsia to help screen cases early in pregnancy. The time points are 6 weeks 0 days to 13 weeks 6 days (V1), 16 weeks 0 days to 21 weeks 6 days (V2), and 22 weeks 0 days to 29 weeks 6 days (V3) of gestation, and at delivery (V4). We then analyzed the model prediction results and provided an interpretable report of cut-off points of the top contributing risk factors and their impact on prediction. Finally, we identified race-based biases in our models and described how we mitigated those biases. We evaluated the results of four machine learning algorithms and found that ensemble methods outperformed non-ensemble methods. Random Forest models achieved an area under the receiver operating characteristic curve of 0.68 ± 0.05 at V1, 0.73 ± 0.05 at V2, 0.76 ± 0.04 at V3, and 0.83 ± 0.03 at V4. In the Random Forest models, the features found to be most informative across all visits fall into several broad categories: weight, blood pressure measurements, uterine artery Doppler measurements, dietary intake, and serum biomarkers. We found that our models are biased toward non-Hispanic Black participants, with a high predictive equality ratio of 1.31. We corrected this bias and reduced this ratio to 1.14. We also evaluated results for predictions of early versus late cases of preeclampsia with severe features or eclampsia and found that placental analytes were the top contributors in model feature importance. Random Forest for this analysis achieved an area under the receiver operating characteristic curve of 0.63 ± 0.11 at V1, 0.79 ± 0.11 at V2, 0.83 ± 0.08 at V3, and 0.84 ± 0.09 at V4. Our experiments suggest that it is important and possible to create screening models to predict the participants at risk of developing preeclampsia with severe features or eclampsia. The top features stress the importance of using several tests, in particular tests for biomarkers and ultrasound measurements. The models could be used as a screening tool as early as 6-13 weeks of gestation to help clinicians identify participants who may subsequently develop preeclampsia, confirming the cases they suspect or identifying unsuspected cases. The proposed approach is easily adaptable to address any adverse pregnancy outcome with fairness.
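
As a rough illustration of the predictive equality ratio reported in the abstract, the sketch below assumes the common fairness definition in which predictive equality compares false positive rates across groups, so that the ratio is the false positive rate of one group divided by that of a reference group. The abstract does not spell out the formula, so this definition, the function names, and the toy labels are all assumptions for illustration, not the authors' pipeline or data.

```python
from typing import Hashable, Sequence


def false_positive_rate(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """FPR = FP / (FP + TN), computed over the true negatives only."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    if fp + tn == 0:
        raise ValueError("no negative examples; FPR is undefined")
    return fp / (fp + tn)


def predictive_equality_ratio(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    groups: Sequence[Hashable],
    group: Hashable,
    reference: Hashable,
) -> float:
    """Assumed definition: FPR(group) / FPR(reference). 1.0 means parity."""

    def fpr_for(name: Hashable) -> float:
        idx = [i for i, g in enumerate(groups) if g == name]
        return false_positive_rate(
            [y_true[i] for i in idx], [y_pred[i] for i in idx]
        )

    reference_fpr = fpr_for(reference)
    if reference_fpr == 0:
        raise ValueError("reference group FPR is zero; ratio is undefined")
    return fpr_for(group) / reference_fpr


# Hypothetical toy labels (1 = predicted or actual case), not study data.
y_true = [0, 0, 0, 0, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 0, 0, 0]
groups = ["A"] * 5 + ["B"] * 5

ratio = predictive_equality_ratio(y_true, y_pred, groups, group="A", reference="B")
print(f"predictive equality ratio (A vs. B): {ratio:.2f}")
```

Under this assumed definition, a ratio above 1 means the first group receives false positive predictions more often than the reference group, and mitigation aims to move the ratio toward 1.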

Publisher

Cold Spring Harbor Laboratory
