Abstract
Recent years have seen increased interest in using machine learning (ML) methods for survival prediction, chiefly on large datasets with mixed data types and/or many predictors. Model comparisons have frequently been limited to performance-measure evaluation, with the chosen measure often suboptimal for assessing survival predictive performance. We investigated ML model performance in an application to osteosarcoma data from the EURAMOS-1 clinical trial (NCT00134030), comparing survival neural networks (SNNs), random survival forests (RSFs) and the Cox proportional hazards model. Three performance measures suitable for assessing survival model predictive performance were considered: the C-index and the time-dependent Brier and Kullback-Leibler scores. Comparisons were also made of predictor importance and patient-specific survival predictions, and the effect of ML hyperparameter choices on performance was investigated. All three models performed comparably as assessed by the C-index and the Brier and Kullback-Leibler scores, with the Cox model and SNNs also comparable in terms of relative predictor importance and patient-specific survival predictions. RSFs showed a tendency to assign less importance to predictors with uneven class distributions and to predict clustered survival curves, the latter a result of tuning hyperparameters that shape the forest through restrictions on terminal node size and tree depth. SNNs were comparatively more sensitive to hyperparameter misspecification, with decreased regularization resulting in inconsistent predicted survival probabilities. We caution against using RSFs for predicting patient-specific survival, as standard model tuning practices may yield aggregated predictions without this being reflected in performance measure values, and we recommend multiple reruns of SNNs to verify prediction consistency.
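The C-index mentioned above (Harrell's concordance index) measures how well a model's risk ranking agrees with observed survival times under right censoring. As a minimal illustration of the idea, not the implementation used in the study, the following pure-Python sketch computes it from observed times, event indicators, and predicted risk scores (higher risk assumed to mean shorter survival):

```python
def c_index(times, events, risks):
    """Harrell's C-index for right-censored data (illustrative sketch).

    times  : observed follow-up times
    events : 1 if the event occurred, 0 if censored
    risks  : predicted risk scores (higher = shorter expected survival)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair (i, j) is comparable only if subject i experienced
            # the event strictly before subject j's observed time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1          # correctly ranked pair
                elif risks[i] == risks[j]:
                    concordant += 0.5        # tied risks count as half
    return concordant / comparable if comparable else float("nan")


# Perfectly ordered risks give C = 1.0; reversed ordering gives C = 0.0.
print(c_index([1, 2, 3, 4], [1, 1, 0, 1], [4, 3, 2, 1]))
```

A value of 0.5 corresponds to random ranking; note that, unlike the time-dependent Brier and Kullback-Leibler scores also used in the study, the C-index assesses only discrimination, not calibration of the predicted survival probabilities.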
Publisher
Cold Spring Harbor Laboratory