Abstract
This study aimed to develop a classification model to predict incident bipolar disorder (BD) in young adults over a 5-year interval, using sociodemographic and clinical features from a large cohort study. We analyzed 1,091 individuals without BD, aged 18 to 24 years at baseline, and used the XGBoost algorithm with feature selection and oversampling methods. Forty-nine individuals (4.49%) received a BD diagnosis five years later. The best model showed acceptable performance (test AUC: 0.786; 95% CI: 0.686 to 0.887) and included ten features: feeling of worthlessness, sadness, current depressive episode, self-reported stress, self-confidence, lifetime cocaine use, socioeconomic status, sex frequency, romantic relationship, and tachylalia. A permutation test with 10,000 permutations showed that the AUC of the final model was significantly better than that of random classifiers. The results provide insights into BD as a latent phenomenon, since depression is its typical initial manifestation. Future studies could monitor subjects during other developmental stages and investigate at-risk populations to improve the characterization of BD. Furthermore, digital health data, biological and neuropsychological information, and neuroimaging could support the development of new predictive models.
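The permutation test mentioned above can be illustrated with a minimal sketch. The abstract does not describe the exact permutation scheme, so this example assumes the simplest variant: the outcome labels of a held-out set are shuffled while the predicted scores stay fixed, and the observed AUC is compared with the resulting null distribution. The function names (`average_ranks`, `auc_from_ranks`, `permutation_test_auc`) and the synthetic data are hypothetical and are not taken from the study.

```python
import numpy as np


def average_ranks(scores):
    """Return 1-based ranks of the scores, with ties given their average rank."""
    scores = np.asarray(scores, dtype=float)
    _, inverse, counts = np.unique(scores, return_inverse=True, return_counts=True)
    ends = np.cumsum(counts)          # last rank occupied by each distinct value
    starts = ends - counts + 1        # first rank occupied by each distinct value
    return ((starts + ends) / 2.0)[inverse]


def auc_from_ranks(ranks, y_true):
    """AUC computed from the Mann-Whitney U statistic of the positive class."""
    y_true = np.asarray(y_true, dtype=bool)
    n_pos = int(y_true.sum())
    n_neg = y_true.size - n_pos
    u = ranks[y_true].sum() - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


def permutation_test_auc(y_true, scores, n_permutations=10_000, seed=0):
    """Return (observed AUC, one-sided p-value) under label permutation.

    Assumption: labels are shuffled and scores are held fixed, so the
    ranks only need to be computed once.
    """
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true, dtype=bool)
    ranks = average_ranks(scores)
    observed = auc_from_ranks(ranks, y_true)
    exceed = 0
    for _ in range(n_permutations):
        permuted = rng.permutation(y_true)
        if auc_from_ranks(ranks, permuted) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly above zero.
    p_value = (exceed + 1) / (n_permutations + 1)
    return observed, p_value


if __name__ == "__main__":
    # Synthetic, imbalanced data for illustration only (not the study data).
    rng = np.random.default_rng(42)
    y = np.zeros(300, dtype=bool)
    y[:14] = True
    scores = rng.normal(size=300) + 1.0 * y
    auc, p = permutation_test_auc(y, scores)
    print(f"AUC = {auc:.3f}, permutation p-value = {p:.4f}")
```

Holding the scores fixed tests whether the ranking of the predictions is associated with the outcome. A stricter variant would refit the whole pipeline on each permuted label set, which is considerably more expensive.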
Publisher
Cold Spring Harbor Laboratory