Affiliations:
1. School of Life Sciences & Technology, Tongji University, Shanghai 200092, China
2. Changhai Hospital, Second Military Medical University, Shanghai 200433, China
3. Changzheng Hospital, Second Military Medical University, Shanghai 200003, China
Abstract
Early diagnosis of cancer is crucial to improving patients' long‐term survival. However, commonly used tumor markers lack the sensitivity and specificity needed for screening. Herein, 10 diagnostic models for 10 common types of cancer are developed with extreme gradient boosting, incorporating 66 laboratory parameters. The datasets consist of a retrospective cohort of 737 503 training and 184 012 validation cases, and a prospective cohort of 174 894 cases for model testing. The areas under the curve of the 10 diagnostic models range from 0.763 to 0.993. Notably, the number of parameters that the models share among the 66 test features varies from model to model. Additionally, SHapley Additive exPlanation analysis reveals that 54 nontumor markers contribute significantly to the models. Cosine similarity analysis and clustering analysis demonstrate that some of the 10 cancers share common pathophysiological characteristics. Feature‐based inference graph models are therefore constructed to infer relationships between nontumor index parameters and the cancers with which they are strongly correlated. In conclusion, this study establishes a machine learning‐based pan‐cancer early warning system that can guide doctors in selecting more accurate testing indicators and in assessing the risk of 10 types of cancer with greater precision.
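The cosine similarity step mentioned in the abstract can be illustrated with a minimal sketch. It assumes each cancer model is summarized by a vector of per-feature importances (for example, mean absolute SHAP values); the abstract does not state which representation the authors used, and the cancer names, vector length, and values below are hypothetical placeholders, not data from the study.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length importance vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for an all-zero vector
    return dot / (norm_u * norm_v)


# Hypothetical per-feature importances for three illustrative models.
# A real analysis would use one entry per laboratory parameter.
importance = {
    "cancer_A": [0.40, 0.10, 0.05, 0.00],
    "cancer_B": [0.38, 0.12, 0.04, 0.01],
    "cancer_C": [0.00, 0.05, 0.30, 0.45],
}

names = sorted(importance)
# Pairwise similarity between every pair of models.
similarity = {
    (a, b): cosine_similarity(importance[a], importance[b])
    for a in names
    for b in names
}

for a in names:
    print(a, [round(similarity[(a, b)], 3) for b in names])
```

In this toy input, cancer_A and cancer_B rely on nearly the same features and therefore score close to 1, while cancer_C scores low against both. A matrix of this kind is the sort of input a clustering analysis could group into cancers with similar feature profiles.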
Cited by
1 article.