BACKGROUND
The dramatic increase in adolescent obesity is a serious global public health crisis. The World Health Organization has projected that by 2030, obesity will affect 254 million children worldwide. Increasing evidence shows that obesity in adolescence increases the risk of type 2 diabetes and cardiovascular disease in adulthood. A prediction model for adolescent obesity could help clinicians and adolescents identify, monitor, and control risk factors before children become overweight, enabling more personalized lifestyle improvements.
OBJECTIVE
This study aims to develop a risk prediction model that uses lifestyle factors, living environment data, and health literacy to predict whether adolescents will become overweight or obese in the upcoming month, and to explore living environment and lifestyle factors that may predispose youth to overweight and obesity.
METHODS
This prospective study was conducted at National Taiwan University Hospital. Parents and eligible adolescents were enrolled in the study. Living environment and lifestyle factors were collected through a wearable device, a smartphone app, an open environmental data API, and a case management platform. Standardized questionnaires were designed to evaluate the health literacy of the adolescents. To analyze the large amounts of heterogeneous data, we implemented six machine learning models: random forest, decision tree, support vector machine (SVM), k-nearest neighbors (KNN), linear discriminant analysis (LDA), and AdaBoost. We used SHapley Additive exPlanations (SHAP) and a feature selection process to find the most cost-effective feature set, which addresses the problem of incomplete data in the real world.
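To illustrate the idea behind SHapley Additive exPlanations, the following minimal sketch computes exact Shapley values for a single sample by enumerating all feature subsets. It is not the study's pipeline: the function `toy_risk`, its weights, and the sample values are hypothetical and chosen only for illustration, and the SHAP library itself uses faster approximations rather than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for sample x relative to a baseline sample."""
    n = len(x)
    features = list(range(n))

    def value(subset):
        # Features in the subset take the sample's values; the rest
        # are held at the baseline values.
        z = [x[i] if i in subset else baseline[i] for i in features]
        return predict(z)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for k in range(n):
            # Shapley weight for a coalition of size k that excludes i
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                phi[i] += weight * (value(set(s) | {i}) - value(set(s)))
    return phi


def toy_risk(z):
    # Hypothetical linear risk score over four standardized features:
    # calories, health literacy, average heart rate, minimum heart rate.
    # The weights are invented for this example only.
    return 1.0 - 0.2 * z[0] - 0.3 * z[1] - 0.1 * z[2] - 0.05 * z[3]


x = [-1.0, -0.5, 0.2, 0.0]        # hypothetical standardized sample
baseline = [0.0, 0.0, 0.0, 0.0]   # hypothetical population average
phi = shapley_values(toy_risk, x, baseline)
for name, contribution in zip(
    ["calories", "health_literacy", "avg_hr", "min_hr"], phi
):
    print(f"{name}: {contribution:+.3f}")
```

The per-feature contributions sum to the difference between the prediction for the sample and the prediction for the baseline, which is the property that makes Shapley values useful for ranking features before selecting a smaller feature set.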
RESULTS
Data from 120 adolescents were collected prospectively over a mean follow-up of 1 year. For risk prediction, the proposed model achieved its best performance with an accuracy of 94.3%, precision of 99.9%, and F1 score of 78.8%. Overall, test-set accuracy ranged from 81.6% to 94.3% across the six machine learning algorithms. After feature selection, the combination of daily calorie consumption, health literacy value, average heart rate, and minimum heart rate was identified as the most cost-effective feature set. The proposed model with only these four features achieved an accuracy of 93.7%, sensitivity of 71%, precision of 88.8%, and F1 score of 78.6%.
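As a rough consistency check on the reported metrics, the F1 score is the harmonic mean of precision and sensitivity (recall). The sketch below recomputes F1 for the four-feature model from its rounded precision and sensitivity, and derives the sensitivity implied by the best model's precision and F1. The helper names are hypothetical, and the derived sensitivity is an arithmetic estimate from rounded figures, not a value reported in the study.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def recall_from_f1(precision, f1):
    """Solve F1 = 2PR / (P + R) for R, given P and F1."""
    return f1 * precision / (2 * precision - f1)


# Four-feature model: precision 88.8%, sensitivity 71% (both rounded)
four_feature_f1 = f1_score(0.888, 0.71)

# Best model: precision 99.9%, F1 78.8%; sensitivity is not reported,
# so this is only the value implied by the two rounded figures.
implied_sensitivity = recall_from_f1(0.999, 0.788)

print(f"F1 from rounded inputs: {four_feature_f1:.3f}")
print(f"Implied sensitivity:    {implied_sensitivity:.3f}")
```

The recomputed F1 of about 0.79 is close to the reported 78.6%; the small gap is consistent with the sensitivity being rounded to a whole percentage.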
CONCLUSIONS
In contrast with previous studies, the proposed model yields reliable predictions of the risk of adolescents becoming overweight or obese by adding objective lifestyle and environmental data. Our results indicate that lower values for features such as health literacy, calorie consumption, average heart rate, and rapid eye movement time increase the risk of becoming overweight or obese. This information can help adolescents understand exactly how to improve their lifestyle and health outcomes. Furthermore, we constructed a cost-effective model that needs only four features to complete the prediction task, which makes the risk prediction model easier to deploy in real life.
CLINICALTRIAL
The study protocol was approved by the institutional review board of the National Taiwan University Hospital (201710066RINB).