Abstract
Background and Objective: Primary lung cancer is a lethal and rapidly developing cancer type and is one of the leading causes of cancer death. Materials and Methods: Statistical methods such as Cox regression are usually used to detect the prognostic factors of a disease. This study investigated survival prediction using machine learning algorithms. The clinical data of 28,458 patients with primary lung cancer were collected from the Surveillance, Epidemiology, and End Results (SEER) database. Results: This study indicated that the survival rate of women with primary lung cancer was often higher than that of men (p < 0.001). Seven popular machine learning algorithms were used to evaluate one-year, three-year, and five-year survival prediction. Two classifiers, extreme gradient boosting (XGB) and logistic regression (LR), achieved the best prediction accuracies. The variable importance of the trained XGB models suggested that surgical removal (feature “Surgery”) made the largest contribution to the one-year survival prediction models, while the metastatic status of the regional lymph nodes (feature “N” stage) was the most important contributor to three-year and five-year survival prediction. The female patients’ three-year prognosis model achieved a prediction accuracy of 0.8297 on the independent future samples, while the male model achieved an accuracy of only 0.7329. Conclusions: These data suggest that male patients may have more complicated factors in lung cancer than female patients, and that it is necessary to develop gender-specific diagnosis and prognosis models.
Funder
Jilin Provincial Key Laboratory of Big Data Intelligent Computing
Education Department of Jilin Province
Jilin University
Fundamental Research Funds for the Central Universities
Cited by
20 articles.