BACKGROUND
In some countries, including Japan, the leading country in terms of longevity, life expectancy has been increasing while healthy life years have not kept pace, necessitating an effective health policy to narrow the gap.
OBJECTIVE
This study aimed to develop a prediction model for healthy life years without activity limitations and to apply the model to health policy to prolong healthy life years.
METHODS
The Comprehensive Survey of Living Conditions, a cross-sectional national survey of Japan, was conducted by the Japanese Ministry of Health, Labour and Welfare in 2013, 2016, and 2019. The data from 1,537,773 responders were used for modelling using machine learning. All participants were randomly split into training (n=1,383,995, 90%) and test (n=153,778, 10%) subsets. An extreme gradient boosting classifier was implemented. Activity limitations were set as the target. Age, sex, and 40 types of diseases or injuries were included as features. Healthy life years without activity limitations were calculated by incorporating the predicted prevalence rate of activity limitations in a life table. To make the model widely usable by individuals, we developed an application tool.
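The life-table step described above can be illustrated with a minimal sketch. It assumes a Sullivan-style calculation, in which the person-years lived in each age interval are weighted by the proportion free of activity limitations; the abstract does not name the exact method, so this is an assumption. The function name, argument names, and the toy numbers in the usage note are hypothetical and are not taken from the study.

```python
def healthy_life_years(survivors, person_years, prevalence, start=0):
    """Sullivan-style healthy life years at the age interval `start`.

    survivors    -- l_x: number alive at the start of each age interval
    person_years -- L_x: person-years lived within each age interval
    prevalence   -- predicted proportion with activity limitations per interval
    All three sequences are assumed to share the same age intervals.
    """
    if not (len(survivors) == len(person_years) == len(prevalence)):
        raise ValueError("all inputs must cover the same age intervals")
    if any(p < 0.0 or p > 1.0 for p in prevalence):
        raise ValueError("prevalence values must lie in [0, 1]")

    # Person-years lived without activity limitations from `start` onward.
    healthy_person_years = sum(
        L * (1.0 - p)
        for L, p in zip(person_years[start:], prevalence[start:])
    )
    # Divide by the survivors entering the starting interval.
    return healthy_person_years / survivors[start]
```

For example, with two hypothetical age intervals, `healthy_life_years([100000, 90000], [950000, 450000], [0.1, 0.5])` weights the first interval by 0.9 and the second by 0.5 before dividing by the 100,000 people entering the first interval. Replacing the predicted prevalence with a target prevalence would show how a policy scenario changes the result.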
RESULTS
In the groups without (n=1,329,901) and with (n=207,872) activity limitations, the median age was 47 (IQR 30-64) and 69 (IQR 54-80) years, respectively (P<.001); female sex comprised 51.3% (n=681,794) in the group without activity limitations and 56.9% (n=118,339) in the group with activity limitations (P<.001). A total of 42 features were included in the feature set. Age had the highest impact on model accuracy, followed by depression or other mental diseases; back pain; bone fracture; other neurological disorders, pain, or paralysis; stroke, cerebral hemorrhage, or infarction; arthritis; Parkinson disease; dementia; and other injuries or burns. The model exhibited high performance with an area under the receiver operating characteristic curve of 0.846 (95% CI 0.842-0.849) with exact calibration for the average probability and fraction of positives. The prediction results were consistent with the observed values of healthy life years for both sexes in each year (range of difference between predictive and observed values: −0.89 to 0.16 in male and 0.61 to 1.23 in female respondents). We applied the prediction model to a regional health policy to prolong healthy life years by adjusting the representative predictors to a target prevalence rate. Additionally, we presented the health condition without activity limitations index, followed by the application development for individual health promotion.
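As a reminder of what the reported area under the receiver operating characteristic curve measures, the following minimal sketch computes it as the probability that a randomly chosen respondent with activity limitations receives a higher predicted score than a randomly chosen respondent without them, counting ties as one half. This is a generic illustration, not the authors' evaluation code; the function name is hypothetical, and the pairwise loop is only practical for small samples.

```python
def roc_auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    y_true  -- 1 for respondents with activity limitations, 0 otherwise
    y_score -- predicted probability of activity limitations
    """
    positives = [s for y, s in zip(y_true, y_score) if y == 1]
    negatives = [s for y, s in zip(y_true, y_score) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")

    correctly_ranked = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correctly_ranked += 1.0   # positive scored above negative
            elif p == n:
                correctly_ranked += 0.5   # ties count as half
    return correctly_ranked / (len(positives) * len(negatives))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups. A confidence interval such as the one reported could be obtained, for instance, by resampling the test set, although the abstract does not state which procedure was used.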
CONCLUSIONS
The prediction model will enable national or regional governments to establish an effective health promotion policy for risk prevention at the population and individual levels to prolong healthy life years. Further investigation is needed to validate the model’s adaptability to various ethnicities and, in particular, to countries where the population exhibits a short life span.