Training and testing of a gradient boosted machine learning model to predict adverse outcome in patients presenting to emergency departments with suspected covid-19 infection in a middle-income setting

Published:2023-09-20 Issue:9 Volume:2 Page:e0000309
ISSN:2767-3170
Container-title:PLOS Digital Health
language:en
Short-container-title:PLOS Digit Health

Author:

Fuller Gordon Ward, Hasan Madina, Hodkinson Peter, McAlpine David, Goodacre Steve, Bath Peter A., Sbaffi Laura, Omer Yasein, Wallis Lee, Marincowitz Carl

Abstract

COVID-19 infection rates remain high in South Africa. Clinical prediction models may help with rapid triage and support clinical decision making for patients with suspected COVID-19 infection. The Western Cape, South Africa, has integrated electronic health care data, facilitating large-scale linked routine datasets. The aim of this study was to develop a machine learning model, suitable for use in a middle-income setting, to predict adverse outcome in patients presenting with suspected COVID-19. A retrospective cohort study was conducted using linked routine data from patients presenting with suspected COVID-19 infection to public-sector emergency departments (EDs) in the Western Cape, South Africa, between 27th August 2020 and 31st October 2021. The primary outcome was death or critical care admission at 30 days. An XGBoost machine learning model was trained and internally tested using split-sample validation. External validation was performed in 3 test cohorts: Western Cape patients presenting during the Omicron COVID-19 wave, a UK cohort during the ancestral COVID-19 wave, and a Sudanese cohort during the ancestral and Eta waves. A total of 282,051 cases were included in a complete case training dataset. The prevalence of 30-day adverse outcome was 4.0%. The most important features for predicting adverse outcome were the requirement for supplemental oxygen, peripheral oxygen saturations, level of consciousness and age. Internal validation using split-sample test data revealed excellent discrimination (C-statistic 0.91, 95% CI 0.90 to 0.91) and calibration (CITL of 1.05). The model achieved C-statistics of 0.84 (95% CI 0.84 to 0.85), 0.72 (95% CI 0.71 to 0.73), and 0.62 (95% CI 0.59 to 0.65) in the Omicron, UK, and Sudanese test cohorts, respectively. Results were materially unchanged in sensitivity analyses examining missing data.
An XGBoost machine learning model achieved good discrimination and calibration in predicting adverse outcome in patients presenting with suspected COVID-19 to Western Cape EDs. Performance was reduced in temporal and geographical external validation.
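
For readers unfamiliar with the C-statistic reported above, the following is a minimal sketch of how it can be computed from predicted risks and observed outcomes. It is not the authors' code: the function name `c_statistic` and the toy data are hypothetical, and the pairwise loop is meant for illustration only, not for a dataset of the size described in the abstract.

```python
from typing import Sequence


def c_statistic(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Probability that a randomly chosen case with the outcome receives a
    higher predicted risk than a randomly chosen case without it.
    Tied scores count as half a concordant pair."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case with and one without the outcome")

    concordant = 0.0
    # Compare every (outcome, no-outcome) pair; quadratic, so toy data only.
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


# Hypothetical toy data: 1 = death or critical care admission within 30 days.
outcomes = [0, 0, 1, 0, 1, 0, 1, 0]
predicted_risk = [0.02, 0.10, 0.35, 0.05, 0.80, 0.40, 0.40, 0.01]

print(round(c_statistic(outcomes, predicted_risk), 3))
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of cases with and without the outcome, which is the scale on which the reported values (0.91 internally, 0.62 to 0.84 externally) should be read.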

Funder

Bill and Melinda Gates Foundation

Publisher

Public Library of Science (PLoS)
