Machine learning methods to predict attrition in a population-based cohort of very preterm infants

Author:

Teixeira Raquel, Rodrigues Carina, Moreira Carla, Barros Henrique, Camacho Rui

Abstract

The timely identification of cohort participants at higher risk for attrition is important for earlier intervention and efficient use of research resources. Machine learning may have advantages over conventional approaches in improving discrimination, because it can analyse complex interactions among predictors. We developed predictive models of attrition using a conventional regression model and several machine learning methods. A total of 542 very preterm (< 32 gestational weeks) infants born in Portugal as part of the European Effective Perinatal Intensive Care in Europe (EPICE) cohort were included. We tested one model with a fixed number of predictors (Baseline) and a second with a dynamic number of variables added from each follow-up (Incremental). Eight classification methods were applied: AdaBoost, Artificial Neural Networks, Functional Trees, J48, J48Consolidated, K-Nearest Neighbours, Random Forest and Logistic Regression. Performance was compared using AUC-PR (Area Under the Precision-Recall Curve), Accuracy, Sensitivity and F-measure. Attrition rates at the four follow-ups were 16%, 25%, 13% and 17%, respectively. Both models demonstrated good predictive performance, with AUC-PR ranging from 69 to 94.1 in the Baseline model and from 72.5 to 97.1 in the Incremental model. Of the whole set of methods, Random Forest presented the best performance at all follow-ups [AUC-PR1: 94.1 (2.0); AUC-PR2: 91.2 (1.2); AUC-PR3: 97.1 (1.0); AUC-PR4: 96.5 (1.7)]. Logistic Regression performed well below Random Forest. The top-ranked predictors were common to both models at all follow-ups: birthweight, gestational age, maternal age, and length of hospital stay. Random Forest presented the highest capacity for prediction and provided interpretable predictors. Researchers involved in cohorts can benefit from our robust models to prepare for and prevent loss to follow-up by directing efforts toward individuals at higher risk.
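The four performance measures named in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and AUC-PR is approximated here by the step-wise average precision, which may differ from the estimator used in the study. Attrition is coded as the positive class (1).

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn), with 1 = lost to follow-up and 0 = retained."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def threshold_metrics(y_true, y_pred):
    """Accuracy, sensitivity (recall) and F-measure for hard 0/1 predictions."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(y_true)
    denom = precision + sensitivity
    f_measure = 2 * precision * sensitivity / denom if denom else 0.0
    return {"accuracy": accuracy, "sensitivity": sensitivity, "f_measure": f_measure}


def average_precision(y_true, scores):
    """Step-wise area under the precision-recall curve.

    Participants are ranked by predicted risk (highest first); precision is
    accumulated at every rank where a true positive is found. Tied scores
    are not treated specially in this sketch.
    """
    positives = sum(y_true)
    if positives == 0:
        return 0.0
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = 0
    area = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            tp += 1
            area += tp / rank  # precision at this recall step
    return area / positives


# Hypothetical toy data: true attrition status and a model's predicted risk.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.35, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = threshold_metrics(y_true, y_pred)
ap = average_precision(y_true, scores)
print(metrics, round(100 * ap, 1))  # scaled to 0-100, as in the abstract
```

AUC-PR depends only on how participants are ranked by predicted risk, so it needs no decision threshold, whereas Accuracy, Sensitivity and F-measure all change with the threshold chosen. It is also generally considered more informative than accuracy when the positive class is a minority, as with the 13% to 25% attrition reported here.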

Funder

Horizon 2020 Framework Programme

Fundação para a Ciência e a Tecnologia

Publisher

Springer Science and Business Media LLC

Subject

Multidisciplinary

Cited by 1 article.
