Affiliation:
1. Institute of Industrial Science, The University of Tokyo, Japan
2. Institute for Health Economics and Policy, Japan
Abstract
Medical events are often infrequent, which makes them hard to predict. In this paper, we focus on predictors that forecast whether a medical event will occur in the next year, and analyze the impact of event frequency and data size on predictor performance. In the experiment, we built 1572 predictors for medical events using Medical Insurance Claims (MICs) data from 800,000 participants and 205.8 million claims over 8 years. The results revealed that (a) forecasting error increases when predicting low-frequency events, and (b) increasing the amount of training data reduces errors. These results suggest that increasing data size is key to solving the low-frequency problem. However, additional methods are still needed to cope with sparse and imbalanced data.