BACKGROUND
Advanced statistical modeling techniques may improve the prediction of medical outcomes. However,
these techniques do not always outperform traditional techniques such as regression. We aimed to externally
validate five modeling strategies for predicting disability in community-dwelling older people in the Netherlands.
OBJECTIVE
To externally validate prediction models for disability in community-dwelling older people in the Netherlands.
METHODS
We analyzed individual patient data from five studies including community-dwelling older people in the Netherlands.
We considered a set of fourteen binary predictors as measured with the Tilburg Frailty Indicator (TFI). With this set, we predicted
the continuous total disability score as measured with the Groningen Activity Restriction Scale (GARS) using five statistical
modeling techniques: general linear model (GLM), support vector machine (SVM), neural net (NN), recursive partitioning (RP),
and random forest (RF). For external validation, we developed a model on one of the five data sets and then applied the model to each
of the four remaining data sets. This process was repeated five times for a total of twenty validations. Calibration characteristics,
the correlation coefficient, and the root mean squared error (RMSE) were used to assess the performance of the models.
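The validation scheme described above (develop on one data set, apply to each of the other four, twenty development-validation pairs in total) can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used in the study: the cohorts are simulated, the names `simulate_cohort`, `external_validation`, and `study_1` to `study_5` are hypothetical, and a simple linear regression on the sum of the fourteen binary items stands in for the five modeling techniques. Calibration is summarized here as the intercept and slope of observed scores regressed on predicted scores, which is one common choice and an assumption about how calibration might be assessed.

```python
import math
import random
from itertools import permutations


def fit_line(x, y):
    """Least-squares intercept and slope of y regressed on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def pearson_r(x, y):
    """Pearson correlation coefficient between x and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rmse(pred, obs):
    """Root mean squared error of predictions against observations."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def simulate_cohort(rng, n=200):
    """Hypothetical cohort: 14 binary items and a noisy continuous score.

    The coefficients below are arbitrary and chosen only for illustration.
    """
    rows = []
    for _ in range(n):
        items = [rng.randint(0, 1) for _ in range(14)]
        score = 18 + 2.0 * sum(items) + rng.gauss(0, 3)
        rows.append((items, score))
    return rows


def external_validation(cohorts):
    """Develop on one cohort, validate on each other cohort, for every pair."""
    results = {}
    for dev, val in permutations(cohorts, 2):  # 5 * 4 = 20 ordered pairs
        # Stand-in model: linear regression on the item sum (an assumption).
        x_dev = [sum(items) for items, _ in cohorts[dev]]
        y_dev = [score for _, score in cohorts[dev]]
        intercept, slope = fit_line(x_dev, y_dev)

        # Apply the frozen model to the validation cohort without refitting.
        pred = [intercept + slope * sum(items) for items, _ in cohorts[val]]
        obs = [score for _, score in cohorts[val]]

        # Calibration: observed regressed on predicted (ideal: 0 and 1).
        cal_intercept, cal_slope = fit_line(pred, obs)
        results[(dev, val)] = {
            "rmse": rmse(pred, obs),
            "r": pearson_r(pred, obs),
            "cal_intercept": cal_intercept,
            "cal_slope": cal_slope,
        }
    return results


rng = random.Random(0)
cohorts = {f"study_{i}": simulate_cohort(rng) for i in range(1, 6)}
results = external_validation(cohorts)
```

Each key of `results` is a (development, validation) pair, so the effect of a deviating development data set would show up as uniformly poor metrics in the rows sharing that first element.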
RESULTS
All models except the NN model performed satisfactorily on the validation data
sets. Using one data set for the development and the validation of the models led to poor performance
for all models because its baseline characteristics deviated from those of the other
data sets.
CONCLUSIONS
All models showed satisfactory performance characteristics on the development data sets. The performance of the
GLM, SVM, RP, and RF models on the validation data sets was also satisfactory, except when the models were developed on the
data set whose baseline characteristics deviated from those of the other data sets used in this study. The
performance of the NN models on the validation data sets was much worse than their initial performance on the development
data sets.