Abstract
Understanding and identifying the markers and clinical information that are associated with colorectal cancer (CRC) patient survival is needed for early detection and diagnosis. In this work, we aimed to build a simple model using Cox proportional hazards (PH) and random survival forest (RSF) and find a robust signature for predicting CRC overall survival. We used stepwise regression to develop Cox PH model to analyse 54 common differentially expressed genes from three mutations. RSF is applied using log-rank and log-rank-score based on 5000 survival trees, and therefore, variables important obtained to find the genes that are most influential for CRC survival. We compared the predictive performance of the Cox PH model and RSF for early CRC detection and diagnosis. The results indicate that SLC9A8, IER5, ARSJ, ANKRD27, and PIPOX genes were significantly associated with the CRC overall survival. In addition, age, sex, and stages are also affecting the CRC overall survival. The RSF model using log-rank is better than log-rank-score, while log-rank-score needed more trees to stabilize. Overall, the imputation of missing values enhanced the model’s predictive performance. In addition, Cox PH predictive performance was better than RSF.
Funder
GSK Africa Non-Communicable Disease Open Lab
Publisher
Public Library of Science (PLoS)
Reference56 articles.
1. Worldwide burden of colorectal cancer: a review;P Favoriti;Updates Surg,2016
2. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries.;H Sung;CA: a cancer journal for clinicians.,2021
3. Identification of key genes for predicting colorectal cancer prognosis by integrated bioinformatics analysis;GP Dai;Oncology letters,2020
4. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries.;F Bray;CA: a cancer journal for clinicians.,2018
Cited by
13 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献