Abstract
Background
Given the established links between an individual’s behaviors and lifestyle factors and potentially adverse health outcomes, univariate or simple multivariate health metrics and scores have been developed to quantify general health at a given point in time and estimate risk of negative future outcomes. However, these health metrics may be challenging for widespread use and are unlikely to be successful at capturing the broader determinants of health in the general population. Hence, there is a need for a multidimensional yet widely employable and accessible way to obtain a comprehensive health metric.
Objective
The objective of the study was to develop and validate a novel, easily interpretable, points-based health score (“C-Score”) derived from metrics measurable using smartphone components and iterations thereof that utilize statistical modeling and machine learning (ML) approaches.
Methods
A literature review was conducted to identify relevant predictor variables for inclusion in the first iteration of a points-based model. This was followed by a prospective cohort study in a UK Biobank population for the purposes of validating the C-Score and developing and comparatively validating variations of the score using statistical and ML models to assess the balance between expediency and ease of interpretability and model complexity. Primary and secondary outcome measures were discrimination of a points-based score for all-cause mortality within 10 years (Harrell c-statistic) and discrimination and calibration of Cox proportional hazards models and ML models that incorporate C-Score values (or raw data inputs) and other predictors to predict the risk of all-cause mortality within 10 years.
Results
The study cohort comprised 420,560 individuals. During a cohort follow-up of 4,526,452 person-years, there were 16,188 deaths from any cause (3.85%). The points-based model had good discrimination (c-statistic=0.66). There was a 31% relative reduction in risk of all-cause mortality per decile of increasing C-Score (hazard ratio of 0.69, 95% CI 0.663-0.675). A Cox model integrating age and C-Score had improved discrimination (8 percentage points; c-statistic=0.74) and good calibration. ML approaches did not offer improved discrimination over statistical modeling.
Conclusions
The novel health metric (“C-Score”) has good predictive capabilities for all-cause mortality within 10 years. Embedding the C-Score within a smartphone app may represent a useful tool for democratized, individualized health risk prediction. A simple Cox model using C-Score and age balances parsimony and accuracy of risk predictions and could be used to produce absolute risk estimations for app users.