Author:
Achimescu Vlad, Sultanescu Dan
Abstract
The authenticity of public debate is challenged by the emergence of networks of non-genuine users (such as political bots and trolls) employed and maintained by governments to influence public opinion. To tackle this issue, researchers have developed algorithms to automatically detect non-genuine users, but it is not clear how to identify relevant content, what features to use and how often to retrain classifiers. Users of online discussion boards who informally flag other users by calling them out as paid trolls provide potential labels of perceived propaganda in real time. Against this background, we test the performance of supervised machine learning models (regularized regression and random forests) to predict discussion board comments perceived as propaganda by users of a major Romanian online newspaper. Results show that precision and recall are relatively high and stable, and re-training the model on new labels does not improve prediction diagnostics. Overall, metadata (particularly a low comment rating) are more predictive of perceived propaganda than textual features. The method can be extended to monitor suspicious activity in other online environments, but the results should not be interpreted as detecting actual propaganda.
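The classification approach described in the abstract, combining textual features with comment metadata and comparing a regularized regression against a random forest, can be illustrated with a short sketch. The snippet below is not the authors' code; the column names, the toy data, and the specific scikit-learn setup (TF-IDF for text, a passthrough for the comment rating) are assumptions chosen only to show the general shape of such a pipeline.

```python
# Minimal sketch (hypothetical data and feature choices) of a pipeline that
# classifies comments flagged by other users as paid-troll activity, using
# both text and metadata, with two models as in the abstract.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

# Hypothetical example data: comment text, a metadata feature (comment rating),
# and a binary label marking comments called out as propaganda by other users.
df = pd.DataFrame({
    "text": ["Great article, thanks!",
             "Typical paid government propaganda...",
             "I disagree with the author on one point",
             "Wake up, they are lying to you"],
    "rating": [5, -12, 1, -8],
    "flagged": [0, 1, 0, 1],
})

# Textual features via TF-IDF; metadata passed through unchanged.
features = ColumnTransformer([
    ("tfidf", TfidfVectorizer(), "text"),
    ("meta", "passthrough", ["rating"]),
])

models = {
    "regularized_regression": LogisticRegression(penalty="l2", max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

X_train, X_test, y_train, y_test = train_test_split(
    df[["text", "rating"]], df["flagged"],
    test_size=0.5, stratify=df["flagged"], random_state=0)

for name, clf in models.items():
    pipe = Pipeline([("features", features), ("clf", clf)])
    pipe.fit(X_train, y_train)
    pred = pipe.predict(X_test)
    print(name,
          "precision:", precision_score(y_test, pred, zero_division=0),
          "recall:", recall_score(y_test, pred, zero_division=0))
```

In practice, as the abstract notes, such a model predicts *perceived* propaganda (user call-outs), not actual propaganda, and the metadata column (here the comment rating) may carry more predictive signal than the text itself.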
Publisher
University of Illinois Libraries
Subject
Computer Networks and Communications, Human-Computer Interaction
Cited by
4 articles.