Affiliations:
1. Chinese Academy of Medical Sciences and Peking Union Medical College, Peking Union Medical College Hospital
2. China Academy of Information and Communications Technology
3. Beijing Chao Yang Hospital and Capital Medical University
4. China-Japan Friendship Hospital
Abstract
Background
Upper tract urothelial carcinoma (UTUC) is a rare malignant tumor of the urinary system. This study aimed to establish personalized models for predicting the 1-, 2-, 3-, and 5-year overall survival (OS) and cancer-specific survival (CSS) of patients with UTUC.
Methods
Data from 2614 cases were obtained from the Surveillance, Epidemiology, and End Results database and randomly divided into training and test datasets at a 7:3 ratio. Univariable and multivariable Cox regression analysis, least absolute shrinkage and selection operator (Lasso) regression analysis, and a backward stepwise process were employed to identify independent predictors. The importance of predictors was further assessed using SHapley Additive exPlanations (SHAP). Six machine learning-based predictive models were then established and evaluated by the area under the receiver operating characteristic curve (AUC), and web calculators were developed to enhance the practicality of the best-performing model. Analyses were performed in R 4.3.0 and Python 3.10.
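The split and evaluation steps described above can be illustrated with a minimal sketch. It is not the authors' code: the function names `train_test_split_indices` and `roc_auc`, the fixed seed, and the toy labels and scores are assumptions made for illustration. It also computes a plain binary AUC for a single time horizon, whereas the study reports AUC values at 1, 2, 3, and 5 years and does not state how censored cases were handled.

```python
import random


def train_test_split_indices(n, train_fraction=0.7, seed=0):
    """Shuffle the indices 0..n-1 and split them into training and test sets."""
    rng = random.Random(seed)  # the seed is an arbitrary choice for reproducibility
    indices = list(range(n))
    rng.shuffle(indices)
    n_train = round(n * train_fraction)
    return indices[:n_train], indices[n_train:]


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case receives a higher
    score than a random negative case; ties count as one half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# 2614 cases split 7:3, as in the Methods paragraph.
train_idx, test_idx = train_test_split_indices(2614)

# Toy example (hypothetical data): 1 = event by the chosen horizon, 0 = no event.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
```

With a 0.7 training fraction, 2614 cases yield 1830 training and 784 test cases. The pairwise AUC definition is quadratic in the number of cases, which is acceptable for a sketch but slower than rank-based implementations on large datasets.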
Results
Multivariable analysis showed that other races (HR=0.82, P=0.031), non-Hispanic ethnicity (HR=0.75, P=0.011), and localized lesions (HR=0.70, P=0.001) were independent prognostic factors, each with a hazard ratio below 1 (lower hazard than the reference category). Lasso identified sex, annual household income, months from diagnosis to treatment, tumor grade, T stage, side of the primary tumor, examination of lymph nodes, radiotherapy, and chemotherapy as independent predictors. The variance inflation factor (VIF) for all variables was less than 5. Among the six machine learning models, Extreme Gradient Boosting (XGBoost) performed best, with good AUC values in both the training dataset (OS: 0.752-0.767; CSS: 0.785-0.795) and the test dataset (OS: 0.691-0.768; CSS: 0.728-0.792).
Conclusions
XGBoost-based predictive models show promise, and web calculators may enhance their practicality. In clinical settings, clinicopathological factors (including pT, tumor grade, and the scope of lesions) and demographic factors (including race, sex, and annual household income) are crucial for UTUC prognosis assessment. Treatment strategies should consider lymph node examination, radiotherapy, and chemotherapy.
Publisher
Research Square Platform LLC