Abstract
There has been great interest in investigating relations between personality and language use on the web and social media. Most recent studies mine users' information available online and then use machine learning algorithms to predict their personality characteristics. A few studies, on the other hand, have relied on the traditional lexical hypothesis, exploring personality under the assumption that personality-related attributes can be obtained from dictionaries. However, little is known about the personality structure that emerges from Twitter data: do the data strictly reflect personality structure as represented by personality models, or do they form unique personality-related semantic patterns? The aim of the study was to assess and interpret the adjective-based personality structure contained in tweets. The data were collected from the open-access "Tweet-sr" Serbian Twitter linguistic corpus (Ljubešić & Klubička, 2014). Latent Dirichlet Allocation, a topic modeling technique, was used to extract topics, and cosine similarity was used to measure similarities between topics, as well as between topics and personality dimensions. The results showed that the optimal solution comprised four non-overlapping topics reflecting specific semantic structures. Topics did not replicate trait constructs but were modestly related to them. The largest similarities were found with Extraversion and Agreeableness, pointing to the conceptual importance of these traits in describing interpersonal behavior. In addition, no inter-topic differences in category distributions were found, with evaluation terms being the second most frequent category in three topics. Although tweets are short-form text messages, they have the potential to communicate socially relevant information through personality descriptors.
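The similarity measure named in the abstract can be illustrated with a minimal sketch. The four-word vocabulary, the topic weights, and the trait marker vector below are hypothetical values chosen for illustration, not data or results from the study, and the function is a generic cosine similarity rather than the authors' implementation.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Convention used here: a zero vector is similar to nothing.
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical word weights over a shared vocabulary of four adjectives.
# In a topic model, each topic is a distribution over the vocabulary.
topic_a = [0.7, 0.1, 0.1, 0.1]
topic_b = [0.1, 0.1, 0.1, 0.7]

# Hypothetical indicator vector marking which adjectives belong to a trait.
trait_marker = [1.0, 0.0, 0.0, 0.0]

sim_topics = cosine_similarity(topic_a, topic_b)      # topic vs. topic
sim_trait = cosine_similarity(topic_a, trait_marker)  # topic vs. trait
```

Under these assumed values, a low `sim_topics` would indicate that the two topics barely overlap, while a higher `sim_trait` would indicate that a topic loads on the adjectives associated with that trait.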
Publisher
Faculty of Philosophy, University of Novi Sad