Affiliations:
1. School of Computing, Newcastle University, Newcastle upon Tyne NE4 5TG, UK
2. St. Petersburg Federal Research Center of the Russian Academy of Sciences, 14th Line of V.O. 39, St. Petersburg 199178, Russia
Abstract
This paper tests the hypothesis that social media bot detection systems based on supervised machine learning may not be as accurate as researchers claim: bots have become increasingly sophisticated, making it difficult for human annotators to detect them better than chance. As a result, obtaining a reliable ground-truth dataset through human annotation becomes infeasible, and supervised machine-learning models inherit the annotation errors. To test this hypothesis, we conducted an experiment in which humans were tasked with recognizing malicious bots on the VKontakte social network, and we compared the “human” answers with the “ground-truth” bot labels (‘a bot’/‘not a bot’). Based on the experiment, we evaluated the annotators’ bot detection efficiency in three scenarios that are typical for cybersecurity but differ in detection difficulty: (1) detection among random accounts, (2) detection among the accounts of a social network ‘community’, and (3) detection among verified accounts. The study showed that, in all three scenarios, humans could detect only simple bots and could not detect more sophisticated ones (significance level of 0.05). The study also evaluates the limits of hypothetical and existing bot detection systems that leverage non-expert-labelled datasets: the balanced accuracy of such systems can drop to 0.5 and lower, depending on bot sophistication and the detection scenario. The paper also describes the experiment design, the collected datasets, the statistical evaluation, and the machine learning accuracy measures applied to support the results. In the discussion, we raise the question of using human labelling in bot detection systems and its potential cybersecurity implications. We also provide open access on GitHub to the datasets used, the experiment results, and the software code for evaluating the statistical and machine learning accuracy metrics used in this paper.
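To make the balanced accuracy argument concrete, the following minimal sketch (in Python, assuming NumPy and scikit-learn; the annotator recall rate is a hypothetical parameter chosen for illustration, not a figure from the paper) simulates how annotation errors on sophisticated bots propagate into a detector's score against the true labels.

# A minimal sketch (assuming NumPy and scikit-learn) of how annotation errors
# propagate into the balanced accuracy of a detector trained on human-labelled
# data. The annotator recall rate below is hypothetical, for illustration only.
import numpy as np
from sklearn.metrics import balanced_accuracy_score

rng = np.random.default_rng(0)

# Hypothetical ground truth: 1 = bot, 0 = genuine account.
y_true = rng.integers(0, 2, size=10_000)

# Simulate annotators who miss sophisticated bots: each true bot label is
# kept with probability `recall_annotator`; genuine accounts are labelled
# correctly.
recall_annotator = 0.5  # annotators detect only half of the bots
y_noisy = y_true.copy()
missed = (y_true == 1) & (rng.random(y_true.size) > recall_annotator)
y_noisy[missed] = 0

# A model that perfectly reproduces its (noisy) training labels still
# inherits the annotation errors when scored against the real ground truth:
# balanced accuracy is ~0.75 here and tends to 0.5 as annotator recall
# tends to 0, matching the limit discussed in the abstract.
print(balanced_accuracy_score(y_true, y_noisy))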