Abstract
Background
COVID-19, caused by SARS-CoV-2, is difficult to diagnose because its clinical manifestations vary widely and its symptoms overlap with those of other common respiratory diseases. This study addresses these difficulties by applying machine learning (ML) methods, particularly the XGBoost algorithm, to Complete Blood Count (CBC) parameters for predictive analysis.
Methods
We performed a retrospective study of 2114 COVID-19 patients treated at our healthcare facility between December 2022 and January 2023. Based on their clinical symptoms, the patients were classified into a fever group (1057 patients) and a pneumonia group (1057 patients). CBC data were used to build predictive models, and model performance was evaluated with metrics including the Area Under the Receiver Operating Characteristic Curve (AUC), accuracy, sensitivity, specificity, and precision. We selected the top 10 predictive variables based on their significance in disease prediction. The data were then split into a training set (70% of patients) and a validation set (30% of patients) for model validation.
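The split and the evaluation metrics named above can be illustrated with a minimal, standard-library Python sketch. It is not the study's code. The function names are hypothetical, the stratified split is an assumption (the abstract does not say how patients were assigned to the two sets), and coding pneumonia as the positive class (1) is also an assumption.

```python
import random


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Return (train_idx, valid_idx), splitting each class separately.

    Assumption: stratification by class; the abstract only states 70%/30%.
    """
    rng = random.Random(seed)
    train_idx, valid_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(len(idx) * train_fraction)
        train_idx.extend(idx[:cut])
        valid_idx.extend(idx[cut:])
    return sorted(train_idx), sorted(valid_idx)


def _ratio(numerator, denominator):
    """Divide, returning NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def confusion_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, and precision for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": _ratio(tp + tn, len(pairs)),
        "sensitivity": _ratio(tp, tp + fn),  # true positive rate
        "specificity": _ratio(tn, tn + fp),  # true negative rate
        "precision": _ratio(tp, tp + fp),    # positive predictive value
    }


def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as one half (pairwise definition)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return _ratio(wins, len(pos) * len(neg))


# Labels only, mirroring the reported group sizes (1 = pneumonia, assumed).
labels = [1] * 1057 + [0] * 1057
train_idx, valid_idx = stratified_split(labels)
```

Under these assumptions, each class of 1057 patients contributes 740 patients to the training set and 317 to the validation set. The pairwise AUC is quadratic in the number of patients, which is acceptable at this cohort size; a rank-based formulation would scale better.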
Results
We identified 31 indicators that differed significantly between the two groups. The XGBoost model outperformed the other models, with an AUC of 0.920 and high precision, sensitivity, specificity, and accuracy. The top 10 features (Age, Monocyte%, Mean Platelet Volume, Lymphocyte%, SIRI, Eosinophil count, Platelet count, Hemoglobin, Platelet Distribution Width, and Neutrophil count) were crucial in constructing a more precise predictive model. The model performed strongly on both the training (AUC = 0.977) and validation (AUC = 0.912) datasets, a result supported by decision curve analysis and the calibration curve.
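Of the ten features, SIRI is the only one derived from other CBC values rather than read from the report. The abstract does not spell out the formula; the sketch below assumes the commonly used definition of the Systemic Inflammation Response Index: neutrophil count times monocyte count divided by lymphocyte count, with all counts in 10^9 cells/L. The function name and the example values are hypothetical.

```python
def siri(neutrophils, monocytes, lymphocytes):
    """Systemic Inflammation Response Index from absolute counts (10^9/L).

    Assumed definition: neutrophils * monocytes / lymphocytes.
    """
    if lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive")
    return neutrophils * monocytes / lymphocytes


# Hypothetical patient: neutrophils 4.0, monocytes 0.5, lymphocytes 2.0
example_siri = siri(4.0, 0.5, 2.0)
```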
Conclusion
ML models that incorporate CBC parameters offer an innovative and effective tool for data analysis in COVID-19. They may improve diagnostic accuracy and the efficacy of therapeutic interventions, and could ultimately help reduce the mortality rate of this infectious disease.
Funder
Zhejiang Medicine and Health Scientific Research Project
Publisher
Springer Science and Business Media LLC
Subject
Public Health, Environmental and Occupational Health