BACKGROUND
Similarity-based machine-learning methods are well suited to personalized prediction and recommendation, and they are increasingly applied in healthcare as electronic health record (EHR) data become widely available. In particular, a similarity learning model that carefully accounts for age can be useful for predicting chronic diseases, which are closely related to ageing.
OBJECTIVE
We aimed to design a similarity model for patients in different age groups in order to predict two major chronic diseases: diabetes and hypertension.
METHODS
We developed an approach that learns from the overlapping periods of two individuals by shifting their viewpoints toward the future and the past, respectively. Based on this idea, we built separate similarity learning models for three sequential age-group intervals: 30-40, 40-50, and 50-60 years. Each model has the same structure, based on a deep neural network. For similarity learning, we used demographic information, bi-annual check-up information, and diagnosis records as input features, and disease-based yes-or-no similarity labels as output features.
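The pair construction described above can be illustrated with a minimal sketch. It is one possible reading of the approach, not the authors' implementation. The record schema, the function names `last_record_in_group` and `build_pairs`, the half-open treatment of the age intervals, the use of the latest record within an interval, and the rule that a pair is labeled 1 when both disease statuses match are all assumptions made for illustration.

```python
from itertools import combinations

# Age-group intervals from the text; treating them as half-open [lo, hi) is an assumption.
AGE_GROUPS = [(30, 40), (40, 50), (50, 60)]


def last_record_in_group(records, group):
    """Return the (features, has_disease) record at the highest age inside [lo, hi), or None."""
    lo, hi = group
    ages = [age for age in records if lo <= age < hi]
    return records[max(ages)] if ages else None


def build_pairs(patients):
    """Build labeled training pairs per age group.

    patients: {patient_id: {age: (features, has_disease)}} (hypothetical schema).
    Returns {group: [(id_a, id_b, input_vector, label), ...]}.
    """
    pairs = {group: [] for group in AGE_GROUPS}
    for (id_a, rec_a), (id_b, rec_b) in combinations(sorted(patients.items()), 2):
        for group in AGE_GROUPS:
            a = last_record_in_group(rec_a, group)
            b = last_record_in_group(rec_b, group)
            if a is None or b is None:
                continue  # the two patients share no records in this age interval
            (feat_a, dis_a), (feat_b, dis_b) = a, b
            # Assumed labeling rule: 1 if disease status matches, else 0.
            label = 1 if dis_a == dis_b else 0
            pairs[group].append((id_a, id_b, feat_a + feat_b, label))
    return pairs


# Toy records invented for illustration only; they are not study data.
patients = {
    "p1": {34: ([0.2, 1.0], False), 42: ([0.4, 1.0], True)},
    "p2": {36: ([0.3, 0.0], False), 44: ([0.5, 0.0], True)},
    "p3": {45: ([0.1, 1.0], False), 52: ([0.2, 1.0], False)},
}
pairs = build_pairs(patients)
```

Each entry of `pairs` would then feed the model for its own age group, so that the three models share one structure but are trained on separate pair sets.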
RESULTS
When we applied our method to pairs of hypertension patients, pairs of diabetes patients, and pairs of one non-diabetes and one diabetes patient, the similarity value was very high (close to 1) in the first two cases and low (close to 0) in the last case. This indicates that similarity learning appropriately reflects the disease status between individuals. In addition, we compared how the conventional single-timepoint method and our method differ in measured similarity for several special cases in which a patient's disease condition changes. In the four special cases examined, the similarity values of the two methods differed by at least 0.2 and at most 0.9. This suggests that our method responds more sensitively to changes in a patient's condition over time and can be applied more effectively to disease prediction in such cases.
CONCLUSIONS
We developed an age-sensitive similarity learning model for personalized prediction of chronic diseases in Koreans. By designing and training a deep similarity learning model on divided age groups, which has not been attempted previously, we showed that, for cases in which a patient's disease pattern changes, the similarity learning results are better than those of the conventional single-timepoint method. Moreover, we suggest that a similarity learning model that considers patients' age differences may help overcome the data shortages that frequently occur in medical datasets.