Affiliation:
1. Department of General Surgery, Tianjin Medical University General Hospital, Tianjin Medical University, Tianjin 300052, China
2. College of Letters and Science, University of California, Berkeley, CA 94720, USA
Abstract
This study aimed to construct a prognostic risk model for colorectal cancer (CRC) using machine-learning algorithms, in order to provide accurate staging and to screen for credible prognostic risk genes. We extracted CRC data from GSE126092 and GSE156355 in the Gene Expression Omnibus (GEO) database and from TCGA datasets, and identified differentially expressed genes (DEGs) by bioinformatics analysis. For the 330 shared DEGs related to CRC prognosis, we divided the analysis period into different phases and applied univariate Cox regression, LASSO, and multivariate Cox regression analysis. GO and KEGG analyses revealed that the functions of these DEGs were primarily related to the cell cycle, DNA replication, cell mitosis, and other related functions, which supported our results from a biological perspective. Finally, a prognostic risk model for CRC based on the CHGA, CLU, PLK1, AXIN2, NR3C2, IL17RB, GCG, and AJUBA genes was constructed, and its risk score enabled us to predict the prognosis for CRC. To obtain a comprehensive and accurate model, we used both internal and external evaluations, and the model correctly separated patients with CRC into a high-risk group with poor prognosis and a low-risk group with good prognosis. In the internal evaluation, the AUC values of the 3-, 5-, and 10-year survival ROC curves were 0.715, 0.721, and 0.777, respectively; in the external evaluation using GSE39582 from the GEO database, they were 0.606, 0.698, and 0.608, respectively. We determined that CLU, PLK1, and IL17RB could be considered independent prognostic factors for CRC with significantly different expression. Using machine-learning methods, a prognostic risk model comprising eight genes was constructed. This model not only provides improved treatment guidance but also offers a novel perspective for analyzing survival conditions at a deeper biological level.
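As a rough illustration of how a gene-based risk score of this kind is typically computed, the sketch below forms a weighted sum of the eight genes' expression values and splits patients into high- and low-risk groups. It is a minimal sketch under stated assumptions, not the authors' implementation: the coefficients, the expression values, and the median cutoff are all hypothetical placeholders, since the abstract reports none of them.

```python
from statistics import median

# The eight genes named in the abstract.
GENES = ["CHGA", "CLU", "PLK1", "AXIN2", "NR3C2", "IL17RB", "GCG", "AJUBA"]

# HYPOTHETICAL coefficients for illustration only; the abstract does not
# report the fitted multivariate Cox coefficients.
PLACEHOLDER_COEFS = {
    "CHGA": 0.10, "CLU": 0.20, "PLK1": -0.30, "AXIN2": -0.15,
    "NR3C2": -0.25, "IL17RB": -0.05, "GCG": -0.10, "AJUBA": 0.35,
}


def risk_score(expression, coefs):
    """Linear predictor: sum over genes of coefficient * expression."""
    return sum(coefs[gene] * expression[gene] for gene in coefs)


def split_by_median(scores):
    """Label each score 'high' if above the median, else 'low'.

    A median cutoff is an assumption; the abstract does not state
    how the high- and low-risk groups were separated.
    """
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores], cutoff


# Made-up expression profiles (not real patient data).
zeros = {gene: 0.0 for gene in GENES}
patients = [
    {gene: 1.0 for gene in GENES},
    dict(zeros),
    dict(zeros, PLK1=2.0),
    dict(zeros, AJUBA=2.0),
]

scores = [risk_score(p, PLACEHOLDER_COEFS) for p in patients]
groups, cutoff = split_by_median(scores)
```

In practice the coefficients would come from the fitted multivariate Cox model, and the resulting groups would be compared with survival curves and time-dependent ROC analysis, as the abstract describes.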
Funder
National Natural Science Foundation of China
Subject
Applied Mathematics; General Immunology and Microbiology; General Biochemistry, Genetics and Molecular Biology; Modeling and Simulation; General Medicine