Author:
Yoon Seowon,Jang Jihee,Son Gaeun,Park Soohyun,Hwang Jueun,Choeh Joon Yeon,Choi Kee-Hong
Abstract
IntroductionWith rapid advancements in natural language processing (NLP), predicting personality using this technology has become a significant research interest. In personality prediction, exploring appropriate questions that elicit natural language is particularly important because questions determine the context of responses. This study aimed to predict levels of neuroticism—a core psychological trait known to predict various psychological outcomes—using responses to a series of open-ended questions developed based on the five-factor model of personality. This study examined the model’s accuracy and explored the influence of item content in predicting neuroticism.MethodsA total of 425 Korean adults were recruited and responded to 18 open-ended questions about their personalities, along with the measurement of the Five-Factor Model traits. In total, 30,576 Korean sentences were collected. To develop the prediction models, the pre-trained language model KoBERT was used. Accuracy, F1 Score, Precision, and Recall were calculated as evaluation metrics.ResultsThe results showed that items inquiring about social comparison, unintended harm, and negative feelings performed better in predicting neuroticism than other items. For predicting depressivity, items related to negative feelings, social comparison, and emotions showed superior performance. For dependency, items related to unintended harm, social dominance, and negative feelings were the most predictive. DiscussionWe identified items that performed better at neuroticism prediction than others. Prediction models developed based on open-ended questions that theoretically aligned with neuroticism exhibited superior predictive performance.