Machine learning outperformed logistic regression classification even with limit sample size: A model to predict pediatric HIV mortality and clinical progression to AIDS

Author:

Domínguez-Rodríguez Sara, Serna-Pascual Miquel, Oletto Andrea, Barnabas Shaun, Zuidewind Peter, Dobbels Els, Danaviah Siva, Behuhuma Osee, Lain Maria Grazia, Vaz Paula, Fernández-Luis Sheila, Nhampossa Tacilta, Lopez-Varela Elisa, Otwombe Kennedy, Liberty Afaaf, Violari Avy, Maiga Almoustapha Issiaka, Rossi Paolo, Giaquinto Carlo, Kuhn Louise, Rojo Pablo, Tagarro Alfredo

Abstract

Logistic regression (LR) is the most common prediction model in medicine. In recent years, supervised machine learning (ML) methods have gained popularity. However, there are many concerns about the utility of ML for small sample sizes. In this study, we aim to compare the performance of 7 algorithms in predicting 1-year mortality and clinical progression to AIDS in a small cohort of infants living with HIV from South Africa and Mozambique. The data set (n = 100) was randomly split into a 70% training set and a 30% validation set. Seven algorithms (LR, Random Forest (RF), Support Vector Machine (SVM), K-Nearest Neighbor (KNN), Naïve Bayes (NB), Artificial Neural Network (ANN), and Elastic Net) were compared. The same predictors were used across all models and included sociodemographic, virologic, immunologic, and maternal status features. For each model, parameter tuning was performed to select the best-performing hyperparameters using 10-fold cross-validation repeated 5 times. A confusion matrix was built to assess each model's accuracy, sensitivity, and specificity. RF ranked as the best algorithm in terms of accuracy (82.8%), sensitivity (78%), and AUC (0.73). Regarding specificity and sensitivity, RF performed better than the other algorithms in the external validation and had the highest AUC. LR showed lower performance than RF, SVM, or KNN. The outcome of children living with perinatally acquired HIV can be predicted with considerable accuracy using ML algorithms. Better models could help less specialized staff in resource-limited countries make prompt referrals when the risk of clinical progression is high.
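The abstract evaluates each model from a confusion matrix on the validation set. A minimal sketch of how accuracy, sensitivity, and specificity follow from the four cells of that matrix is shown below. It is an illustration only, not the authors' code: the names `ConfusionMatrix` and `build_confusion_matrix` are hypothetical, the outcome is assumed to be coded 1 (death or progression to AIDS) and 0 (neither), and the labels in the example are made up rather than taken from the study.

```python
import math
from dataclasses import dataclass
from typing import Sequence


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts of a binary classifier's outcomes (hypothetical helper type)."""
    tp: int  # true positives: event predicted, event observed
    fp: int  # false positives: event predicted, no event observed
    tn: int  # true negatives: no event predicted, no event observed
    fn: int  # false negatives: no event predicted, event observed

    @property
    def accuracy(self) -> float:
        # Share of all predictions that were correct.
        return _ratio(self.tp + self.tn, self.tp + self.fp + self.tn + self.fn)

    @property
    def sensitivity(self) -> float:
        # Share of observed events that the model flagged.
        return _ratio(self.tp, self.tp + self.fn)

    @property
    def specificity(self) -> float:
        # Share of observed non-events that the model left unflagged.
        return _ratio(self.tn, self.tn + self.fp)


def build_confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int]) -> ConfusionMatrix:
    """Tally the four cells from 0/1 labels (1 = event, an assumed coding)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        elif truth == 1 and pred == 0:
            fn += 1
        else:
            raise ValueError("labels must be 0 or 1")
    return ConfusionMatrix(tp=tp, fp=fp, tn=tn, fn=fn)


# Made-up validation labels for illustration; these are not study data.
observed = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 0, 0, 1, 0, 1]
matrix = build_confusion_matrix(observed, predicted)
print(matrix.accuracy, matrix.sensitivity, matrix.specificity)
```

With a validation set of roughly 30 children, each misclassified case moves accuracy by about 3 percentage points, which is worth keeping in mind when comparing the reported figures across algorithms.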

Funder

ViiV Healthcare

Publisher

Public Library of Science (PLoS)

Subject

Multidisciplinary
