Abstract
The World Health Organization (WHO) predicted that 10 million people would have died of cancer by 2020. According to recent studies, liver cancer is one of the most prevalent cancers worldwide. Hepatocellular carcinoma (HCC) is the most common form of primary liver cancer and occurs most frequently in patients with chronic liver conditions (such as cirrhosis). It is therefore important to predict liver cancer outcomes more accurately using machine learning. This study examines survival prediction on an HCC dataset using three strategies. First, missing values are imputed using the mean, the mode, and k-Nearest Neighbor (k-NN). We then compare feature selection using wrapper and embedded methods. The embedded method employs Least Absolute Shrinkage and Selection Operator (LASSO) and ridge regression in conjunction with Logistic Regression (LR). In the wrapper method, gradient boosting and random forests are used to eliminate features recursively. The classification algorithms used for prediction are k-NN, Random Forest (RF), and Logistic Regression. The experimental results indicate that Recursive Feature Elimination with Gradient Boosting (RFE-GB) produces the best results, with an accuracy of 96.66% and an F1-score of 95.66%.
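The three strategies described in the abstract (imputation, feature selection, classification) can be sketched as a grid of scikit-learn pipelines. This is only an illustrative sketch, not the authors' implementation: the data are synthetic (generated with `make_classification`, with roughly 10% of values blanked out at random), and the sample size, number of selected features, regularization strength `C`, tree counts, and 3-fold cross-validation are all assumptions chosen to keep the example small. The scores it prints will not reproduce the figures reported above.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.feature_selection import RFE, SelectFromModel
from sklearn.impute import KNNImputer, SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the HCC data (assumption: sizes are arbitrary).
rng = np.random.default_rng(0)
X, y = make_classification(n_samples=165, n_features=20,
                           n_informative=6, random_state=0)
X[rng.random(X.shape) < 0.10] = np.nan  # inject ~10% missing values

# Strategy 1: missing-value imputation (mean, mode, k-NN).
imputers = {
    "mean": SimpleImputer(strategy="mean"),
    "mode": SimpleImputer(strategy="most_frequent"),
    "knn": KNNImputer(n_neighbors=5),
}

# Strategy 2: feature selection.
# Wrapper: recursive feature elimination driven by a tree ensemble.
# Embedded: logistic regression with an L1 (LASSO) or L2 (ridge) penalty.
selectors = {
    "RFE-GB": RFE(GradientBoostingClassifier(n_estimators=25, random_state=0),
                  n_features_to_select=10, step=5),
    "RFE-RF": RFE(RandomForestClassifier(n_estimators=25, random_state=0),
                  n_features_to_select=10, step=5),
    "LASSO-LR": SelectFromModel(
        LogisticRegression(penalty="l1", solver="liblinear", C=1.0),
        threshold=1e-5),  # keep features with non-zero coefficients
    "Ridge-LR": SelectFromModel(
        LogisticRegression(C=1.0, max_iter=1000),  # L2 is the default penalty
        threshold="median"),  # keep the larger half of the coefficients
}

# Strategy 3: classifiers.
classifiers = {
    "k-NN": KNeighborsClassifier(n_neighbors=5),
    "RF": RandomForestClassifier(n_estimators=25, random_state=0),
    "LR": LogisticRegression(max_iter=1000),
}

results = {}
for imp_name, imputer in imputers.items():
    for sel_name, selector in selectors.items():
        for clf_name, classifier in classifiers.items():
            pipe = Pipeline([
                ("impute", imputer),
                ("scale", StandardScaler()),
                ("select", selector),
                ("classify", classifier),
            ])
            # Imputation and selection are refit inside each fold,
            # so no information leaks from the held-out data.
            scores = cross_validate(pipe, X, y, cv=3,
                                    scoring=("accuracy", "f1"))
            results[(imp_name, sel_name, clf_name)] = (
                scores["test_accuracy"].mean(),
                scores["test_f1"].mean(),
            )

best = max(results, key=lambda key: results[key][0])
print("best combination:", best, "accuracy/F1:", results[best])
```

Wrapping every step in a single `Pipeline` is a design choice of this sketch: it ensures the imputer and selector are fitted on the training folds only, which matters on a small dataset where leakage can inflate accuracy.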
Funder
Deanship of Scientific Research, King Faisal University
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science
Cited by
7 articles.