Abstract
Estimating and predicting heterogeneous restricted mean survival time (hRMST) is of great clinical importance because hRMST provides an easily interpretable and clinically meaningful summary of the survival function in the presence of censoring and individual covariates. Existing methods for modeling hRMST rely on proportional hazards or other parametric assumptions about the survival distribution. In this paper, we propose a random forest based estimator of hRMST for right-censored survival data with covariates and prove a central limit theorem for the resulting estimator. We also present a computationally efficient construction of confidence intervals for hRMST. Our simulations show that the resulting confidence intervals achieve the correct coverage probability for hRMST, and that the random forest based estimate of hRMST has smaller prediction error than parametric models when those models are mis-specified. We apply the method to the ovarian cancer data set from The Cancer Genome Atlas (TCGA) project to predict hRMST and show improved prediction performance over existing methods. A software implementation, srf, written in R and C++, is available at https://github.com/lmy1019/SRF.
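For readers unfamiliar with the quantity being estimated, the restricted mean survival time up to a horizon tau is E[min(T, tau)], which equals the area under the survival curve S(t) on [0, tau]. The sketch below illustrates only this basic, covariate-free definition using the Kaplan-Meier estimate of S(t); it is not the random forest estimator proposed in the paper and is not the srf package's API. The function name `km_rmst` and its arguments are hypothetical, chosen for illustration.

```python
def km_rmst(times, events, tau):
    """Area under the Kaplan-Meier curve on [0, tau].

    times  : observed follow-up times (event or censoring time)
    events : 1 if the event was observed, 0 if right-censored
    tau    : restriction horizon

    Illustrative assumption: if tau exceeds the largest observed time,
    the curve is extended flat at its last value.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0      # current Kaplan-Meier survival estimate
    rmst = 0.0      # accumulated area under the step function
    prev_t = 0.0
    i = 0
    while i < len(data) and data[i][0] <= tau:
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group tied observations at the same time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        rmst += surv * (t - prev_t)          # rectangle up to time t
        surv *= 1.0 - deaths / n_at_risk     # Kaplan-Meier update
        n_at_risk -= leaving
        prev_t = t
    rmst += surv * (tau - prev_t)            # final piece up to tau
    return rmst


# With no censoring, the result equals the sample mean of min(T, tau).
uncensored = km_rmst([1, 2, 3, 4], [1, 1, 1, 1], tau=4)
truncated = km_rmst([1, 2, 3, 4], [1, 1, 1, 1], tau=2.5)
```

With censored observations the same function still applies, since censored subjects leave the risk set without lowering the survival estimate. The heterogeneous version described in the abstract replaces this single population-level curve with a covariate-specific one.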
Funder
National Institute of General Medical Sciences
Subject
Genetics (clinical), Genetics, Molecular Medicine