Author:
Hien Nguyen Thi Kim, Tsai Feng-Jen, Chang Yu-Hui, Burton Whitney, Phuc Phan Thanh, Nguyen Phung-Anh, Harnod Dorji, Lam Carlos Shu-Kei, Lu Tsung-Chien, Chen Chang-I, Hsu Min-Huei, Lu Christine Y., Huang Chih-Wei, Yang Hsuan-Chia, Hsu Jason C.
Abstract
Background: Previous studies have identified COVID-19 risk factors, such as age and chronic health conditions, that are linked to severe outcomes and mortality. However, accurately predicting severe illness in COVID-19 patients remains challenging, and precise methods are lacking.

Objective: This study aimed to use real-world clinical data and multiple machine learning algorithms to develop predictive models for assessing the risk of severe outcomes or mortality in hospitalized patients with COVID-19.

Methods: Data were obtained from the Taipei Medical University Clinical Research Database (TMUCRD), which includes electronic health records from three hospitals in Taiwan. The study included patients admitted to these hospitals who received an initial diagnosis of COVID-19 between January 1, 2021, and May 31, 2022. The primary outcome was a composite of severe infection, comprising ventilator use, intubation, ICU admission, and mortality. Secondary outcomes were the individual indicators. The dataset encompassed demographic data, health status, COVID-19 specifics, comorbidities, medications, and laboratory results. Two modes were used: the full mode included all features, and the simplified mode included only the 30 most important features, selected using the algorithm of the best-performing model in the full mode. Seven machine learning algorithms were employed, and model performance was evaluated using metrics such as the area under the receiver operating characteristic curve (AUROC), accuracy, sensitivity, and specificity.

Results: The study included 22,192 eligible inpatients diagnosed with COVID-19. In the full mode, the model using the light gradient boosting machine algorithm achieved the highest AUROC (0.939), with an accuracy of 85.5%, a sensitivity of 0.897, and a specificity of 0.853. Age, vaccination status, neutrophil count, sodium level, and platelet count were significant features. In the simplified mode, the extreme gradient boosting algorithm yielded an AUROC of 0.935, an accuracy of 89.9%, a sensitivity of 0.843, and a specificity of 0.902.

Conclusion: This study illustrates the feasibility of constructing precise predictive models for severe outcomes or mortality in COVID-19 patients by combining significant predictors with advanced machine learning. These findings can help healthcare practitioners proactively predict and monitor severe outcomes or mortality among hospitalized COVID-19 patients, improving treatment and resource allocation.
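
The evaluation metrics named in the Methods (AUROC, accuracy, sensitivity, specificity) and the idea of keeping only the top-ranked features for the simplified mode can be illustrated with a minimal sketch. This is not the study's code: the function names, the 0.5 decision threshold, the importance scores, and the toy labels are all assumptions made for illustration, and the abstract does not state how feature importance was computed beyond the algorithm of the best full-mode model.

```python
from typing import Dict, List, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity for binary labels (1 = severe outcome).

    Assumes both classes are present; otherwise a denominator would be zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # severe, flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-severe, not flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-severe, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # severe, missed
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive case scores higher
    than a random negative case, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def top_k_features(importances: Dict[str, float], k: int = 30) -> List[str]:
    """Names of the k features with the largest importance scores."""
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]


if __name__ == "__main__":
    # Toy data only; these values are invented and unrelated to the study results.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0]
    scores = [0.9, 0.8, 0.3, 0.7, 0.2, 0.1, 0.4, 0.05]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

    print(classification_metrics(y_true, y_pred))
    print(auroc(y_true, scores))

    # Hypothetical importance scores for three of the features named in the Results.
    toy_importances = {"age": 0.5, "sodium": 0.2, "platelet_count": 0.3}
    print(top_k_features(toy_importances, k=2))
```

The pairwise AUROC above is quadratic in the number of patients and is meant only to make the definition concrete; for a cohort of this size a library routine would be the practical choice.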