Author:
Shimokawa Asanao, Narita Yoshitaka, Shibui Soichiro, Miyaoka Etsuo
Abstract
In medical research, the statistical unit is usually the individual patient. In some settings, however, the unit of interest is a set of aggregated data. Each such set is then regarded as a concept in a symbolic representation, and each concept is described by a hyperrectangle or by multiple points in the variable space. To construct a tree-structured model from aggregate survival data of this kind, we propose a new approach in which a datum can belong to several terminal nodes of a tree. Building the model under this condition is expected to yield a more flexible model while retaining the ease of interpretation of a hierarchical structure. In this approach, the survival function of concepts that are only partially included in a node is constructed using the Kaplan-Meier method, where the numbers of events and of subjects at risk at each time point are replaced by the expected number of individual descriptions of the concepts. We apply the proposed model to data on patients with primary brain tumors and obtain an interpretation of the data that differs from those given by classical survival tree methods.
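The modified Kaplan-Meier step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each observation carries a weight in [0, 1] giving the expected share of a concept's individual descriptions that falls in the node, and it simply sums these weights in place of the usual event and at-risk counts. The function names, and the uniform-interval rule used to derive a weight, are hypothetical choices made for illustration only.

```python
def interval_fraction_below(lo, hi, cut):
    """Hypothetical weight: share of the interval [lo, hi] lying at or below `cut`,
    assuming individual descriptions are spread uniformly over the interval."""
    if hi <= lo:                      # degenerate interval: a single point
        return 1.0 if lo <= cut else 0.0
    return min(max((cut - lo) / (hi - lo), 0.0), 1.0)


def weighted_kaplan_meier(times, events, weights):
    """Kaplan-Meier estimate with weighted (expected) counts.

    times   -- observed times (event or censoring)
    events  -- 1 if the event occurred, 0 if censored
    weights -- expected membership of each observation in the node (assumption)
    Returns a list of (time, survival) pairs at each time with weighted events.
    """
    records = sorted(zip(times, events, weights))
    at_risk = float(sum(weights))     # expected number at risk before any time
    survival = 1.0
    curve = []
    i, n = 0, len(records)
    while i < n:
        t = records[i][0]
        expected_events = 0.0
        leaving = 0.0
        # Pool all observations tied at time t.
        while i < n and records[i][0] == t:
            _, event, weight = records[i]
            leaving += weight
            if event:
                expected_events += weight
            i += 1
        if expected_events > 0 and at_risk > 0:
            survival *= 1.0 - expected_events / at_risk
            curve.append((t, survival))
        at_risk -= leaving            # events and censored cases leave the risk set
    return curve


# A concept whose interval [40, 60] is split at 50 contributes weight 0.5 to the node.
w = interval_fraction_below(40, 60, 50)
curve = weighted_kaplan_meier(
    times=[2, 4, 4, 6],
    events=[1, 1, 0, 1],
    weights=[1.0, w, 1.0, w],
)
```

With all weights equal to 1, the function reduces to the ordinary Kaplan-Meier estimator, which is a quick way to check it against a standard implementation.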
Subject
Statistics, Probability and Uncertainty; General Medicine; Statistics and Probability