Affiliation:
1. Department of Mathematics, University of Texas at Arlington, Arlington, Texas, USA
Abstract
The mixture cure model is widely used to analyze survival data in the presence of a cured subgroup. Standard logistic regression-based approaches to modeling the incidence may predict cure status poorly, particularly when the covariate effect is non-linear. Supervised machine learning techniques can classify better than logistic regression because they can capture non-linear patterns in the data. However, interpretability remains a concern because of the trade-off between interpretability and predictive accuracy. We propose a new mixture cure model in which the incidence part is modeled using a decision tree-based classifier and the proportional hazards structure for the latency part is preserved. The proposed model is easy to interpret, closely mimics the human decision-making process, and provides the flexibility to capture both linear and non-linear covariate effects. To estimate the model parameters, we develop an expectation maximization algorithm. A detailed simulation study shows that the proposed model outperforms the logistic regression-based and spline regression-based mixture cure models in terms of both model fit and predictive accuracy. An illustrative example with data from a leukemia study is presented to further support our conclusion.
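As a rough illustration of the structure described above, the following minimal sketch shows the standard mixture cure decomposition and the usual E-step weight used in expectation maximization for such models. It is not the authors' implementation: the function names are hypothetical, and the uncured probability and the uncured survival probability are taken as given inputs (in the proposed model they would come from a decision tree classifier and a proportional hazards fit, respectively).

```python
def population_survival(pi_uncured, surv_uncured):
    """Mixture cure survival: cured fraction plus uncured fraction times
    the survival probability of the uncured (susceptible) subjects."""
    return (1.0 - pi_uncured) + pi_uncured * surv_uncured


def e_step_weight(delta, pi_uncured, surv_uncured):
    """Conditional probability that a subject is uncured, given the data.

    delta: 1 if the event was observed, 0 if the subject was censored.
    pi_uncured: assumed incidence estimate P(uncured | covariates).
    surv_uncured: assumed latency estimate S_u(t | covariates) at the
        subject's observed time.
    """
    if delta == 1:
        # An observed event can only come from the uncured group.
        return 1.0
    # A censored subject is either cured or uncured and still event-free.
    return pi_uncured * surv_uncured / population_survival(pi_uncured, surv_uncured)


if __name__ == "__main__":
    # Censored subject with a 60% uncured probability and 50% uncured survival.
    print(e_step_weight(0, 0.6, 0.5))
```

In an EM loop these weights would stand in for the unobserved cure status when the incidence and latency parts are refitted in the M-step; how the tree-based classifier uses them is specific to the paper and is not reproduced here.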
Subject
Statistics and Probability, Epidemiology
Cited by
7 articles.