Author:
Lin Yun C., Mallia Daniel, Clark-Sevilla Andrea O., Catto Adam, Leshchenko Alisa, Haas David M., Wapner Ronald, Pe’er Itsik, Raja Anita, Salleb-Aouissi Ansaf
Abstract
Preeclampsia is a type of hypertension that develops during pregnancy. It is one of the leading causes of maternal morbidity, with consequences during and after pregnancy. Because of its diverse clinical presentation, preeclampsia is a uniquely challenging adverse pregnancy outcome to predict and manage. In this paper, we explore preeclampsia in a nulliparous study cohort with machine learning techniques to build a model that distinguishes the participants most at risk of morbidity (those with preeclampsia with severe features or eclampsia) from those with no pregnancy-related hypertension. We curated the dataset for this secondary analysis using only training examples that have all known biomarkers, factors, and placental analytes. We built classification models at discrete time points in pregnancy that combine risk factors for preeclampsia with severe features or eclampsia to help screen cases early in pregnancy. The time points are 6 weeks 0 days to 13 weeks 6 days (V1), 16 weeks 0 days to 21 weeks 6 days (V2), and 22 weeks 0 days to 29 weeks 6 days (V3) of gestation, and at delivery (V4). We then analyzed the model prediction results and provided an interpretable report of cut-off points of the top contributing risk factors and their impact on prediction. Finally, we identified race-based biases in our models and describe how we mitigated them. We evaluated the results of four machine learning algorithms and found that ensemble methods outperformed non-ensemble methods. Random Forest models achieved an area under the receiver operating characteristic curve of 0.68 ± 0.05 at V1, 0.73 ± 0.05 at V2, 0.76 ± 0.04 at V3 and 0.83 ± 0.03 at V4. In the Random Forest models, the features found to be most informative across all visits fall into several broad categories: weight, blood pressure measurements, uterine artery Doppler measurements, diet intake and serum biomarkers. We found that our models are biased toward non-Hispanic black participants, with a high predictive equality ratio of 1.31. We mitigated this bias, reducing the ratio to 1.14. We also evaluated predictions of early versus late cases of preeclampsia with severe features or eclampsia and found that placental analytes were the top contributors to model feature importance. Random Forest for this analysis achieved an area under the receiver operating characteristic curve of 0.63 ± 0.11 at V1, 0.79 ± 0.11 at V2, 0.83 ± 0.08 at V3 and 0.84 ± 0.09 at V4. Our experiments suggest that it is both important and possible to create screening models that predict which participants are at risk of developing preeclampsia with severe features or eclampsia. The top features stress the importance of using several tests, in particular tests for biomarkers and ultrasound measurements. The models could be used as a screening tool as early as 6-13 weeks gestation to help clinicians identify participants who may subsequently develop preeclampsia, confirming the cases they suspect or identifying unsuspected cases. The proposed approach is easily adaptable to address any adverse pregnancy outcome with fairness.
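To make the fairness figures above concrete, the following is a minimal sketch of how a predictive equality ratio can be computed. It assumes the common definition, namely the false positive rate of one group divided by the false positive rate of a reference group; the abstract does not spell out the exact formula the authors used. The function names, group labels and toy data are hypothetical and chosen only for illustration.

```python
from typing import Hashable, List, Sequence, Tuple


def false_positive_rate(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Return FP / (FP + TN), computed over the truly negative examples."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    if fp + tn == 0:
        raise ValueError("no negative examples; false positive rate is undefined")
    return fp / (fp + tn)


def predictive_equality_ratio(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    groups: Sequence[Hashable],
    group: Hashable,
    reference: Hashable,
) -> float:
    """Assumed definition: FPR(group) / FPR(reference). A value of 1.0 means parity."""

    def subset(label: Hashable) -> Tuple[List[int], List[int]]:
        idx = [i for i, g in enumerate(groups) if g == label]
        return [y_true[i] for i in idx], [y_pred[i] for i in idx]

    fpr_group = false_positive_rate(*subset(group))
    fpr_reference = false_positive_rate(*subset(reference))
    if fpr_reference == 0:
        raise ValueError("reference group has zero false positive rate")
    return fpr_group / fpr_reference


# Hypothetical toy data: 1 = predicted or actual case, 0 = no hypertension.
y_true = [0, 0, 0, 0, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 0, 0, 0]
groups = ["A"] * 5 + ["B"] * 5

# Group A has 2 false positives among 4 negatives, group B has 1 among 4.
ratio = predictive_equality_ratio(y_true, y_pred, groups, group="A", reference="B")
print(ratio)
```

Under this assumed definition, a ratio above 1 means that participants in the examined group who do not develop the condition are flagged as at risk more often than comparable participants in the reference group, so moving from 1.31 to 1.14 would correspond to false positive rates that are closer to parity.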
Publisher
Cold Spring Harbor Laboratory