Affiliation:
1. Gachon University
2. Chungbuk National University
3. Semyung University
Abstract
Background: In recent years, the incidence of hypertension has increased dramatically in both elderly and young populations, and it rose further with the outbreak of the COVID-19 pandemic. Several studies have identified various risk factors for chronic hypertension. Chronic diseases are often multifactorial rather than attributable to a single cause, and they have been found to be associated with COVID-19. Disease detection therefore needs to take multiple factors into account. This study aims to improve hypertension prediction by applying a multivariate outlier removal method based on a deep autoencoder (DAE) to Korean national health data from the Korea National Health and Nutrition Examination Survey (KNHANES) database.

Methods: The study was divided into two modules. The first module, data pre-processing, integrated external features for COVID-19 patients by merging the KNHANES 2020 data with Kaggle data on region, age, and gender, and then performed multicollinearity-based feature selection on both the KNHANES dataset and the integrated dataset. The second module, predictive analysis, detects and predicts hypertension from KNHANES data using OrdinalEncoder (OE) normalization and multivariate outlier removal with a deep autoencoder.

Results: We compared the accuracy, F1 score, and area under the ROC curve (AUC) of each classification model. The proposed XGBoost model with DAE_OE achieved the best results, with an accuracy of 87.78%, an F1 score of 89.95%, and an AUC of 92.28% for COVID-19 cases, and an accuracy of 87.72%, an F1 score of 89.94%, and an AUC of 92.23% for non-COVID-19 cases.

Conclusions: By building a high-quality training dataset with DAE and OE, the proposed method improved the prediction performance of the classifiers used in all of the experiments. We also demonstrate experimentally how each step of the proposed method contributes to that improvement. The proposed method can be applied not only to hypertension but also to the detection of other diseases such as stroke and cardiovascular disease.
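
The abstract compares classifiers by accuracy, F1 score, and AUC. As a reminder of how these three metrics are computed for a binary hypertension label, here is a minimal sketch in plain Python. The function names and the toy labels and scores are illustrative assumptions; they are not taken from the study, its code, or the KNHANES data.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (made-up values): 1 = hypertensive, 0 = not hypertensive.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]  # hypothetical model probabilities
y_pred = [1 if s >= 0.5 else 0 for s in scores]     # threshold at 0.5

print(accuracy(y_true, y_pred))  # 0.75
print(f1_score(y_true, y_pred))  # 0.75
print(roc_auc(y_true, scores))   # 0.9375
```

Accuracy and F1 depend on the chosen decision threshold (0.5 here), whereas AUC is computed from the raw scores and is threshold-independent, which is one reason all three are commonly reported together.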
Publisher
Research Square Platform LLC