Affiliation:
1. Islamic Azad University Science and Research Branch
2. Iran University of Science and Technology
Abstract
COVID-19, an infectious disease first identified in late 2019 in Wuhan, China, has killed many people around the world. Understanding the available COVID-19 data sets can help healthcare professionals identify some cases at an early stage. This paper proposes an innovative pipeline-based framework to predict death or survival from COVID-19 on the Covid-19MPD dataset. Preprocessing is an important part of the proposed framework for achieving a high-quality result. Various machine learning models with optimal hyperparameters are implemented in the framework. Using the same experimental conditions and data set, multiple experiments were performed with different combinations of preprocessing steps and models to maximize the AUC of the prediction. Because the data were relatively high-dimensional, it was necessary to find the features that have an impact on death or survival from COVID-19. Dimensionality reduction methods such as PCA and ICA, and feature selection methods such as maximum relevance minimum redundancy and permutation feature importance, were used. Finding the features that have a great impact on the death or survival of the patient can help experts control and ultimately treat this disease more efficiently. Across the experiments, the configuration of the proposed framework that used standardized data, four components, and the k-nearest neighbor algorithm attained the best result in terms of AUC (100%). Given this performance in predicting COVID-19, the framework can be used in the smart systems of medical centers.
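The kind of experiment described in the abstract (standardize the data, classify with k-nearest neighbors, score with AUC) can be illustrated with a minimal sketch. It is not the authors' implementation. The data are synthetic rather than the Covid-19MPD dataset, the four random features merely stand in for four reduced components, the dimensionality reduction step itself is omitted, and all function names and the choice of k = 5 are assumptions made for illustration.

```python
import math
import random


def standardize(train, test):
    """Scale features to zero mean and unit variance using training statistics only."""
    cols = list(zip(*train))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 for constant features to avoid division by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]

    def scale(rows):
        return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]

    return scale(train), scale(test)


def knn_scores(train_x, train_y, test_x, k=5):
    """Score each test row by the fraction of its k nearest training rows labelled 1."""
    scores = []
    for row in test_x:
        nearest = sorted(range(len(train_x)),
                         key=lambda i: math.dist(row, train_x[i]))[:k]
        scores.append(sum(train_y[i] for i in nearest) / k)
    return scores


def auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic stand-in data (assumption): two Gaussian classes with four features.
rng = random.Random(0)


def make_rows(n, centre, label):
    return [([rng.gauss(centre, 1.0) for _ in range(4)], label) for _ in range(n)]


data = make_rows(100, 0.0, 0) + make_rows(100, 2.0, 1)
rng.shuffle(data)
train, test = data[:150], data[150:]
train_x, train_y = [x for x, _ in train], [y for _, y in train]
test_x, test_y = [x for x, _ in test], [y for _, y in test]

train_x, test_x = standardize(train_x, test_x)
test_auc = auc(test_y, knn_scores(train_x, train_y, test_x, k=5))
print(f"test AUC: {test_auc:.3f}")
```

The scaling statistics are computed on the training split only, so no information from the held-out rows leaks into preprocessing. In a comparison of many preprocessing and model combinations, the same split would be reused for every configuration so that the AUC values are comparable.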
Publisher
Research Square Platform LLC