Scalable and Generalizable Social Bot Detection through Data Selection

Published: 2020-04-03 Issue: 01 Volume: 34 Page: 1096-1103
ISSN: 2374-3468
Container-title: Proceedings of the AAAI Conference on Artificial Intelligence
Short-container-title: AAAI

Author:

Kai-Cheng Yang, Onur Varol, Pik-Mai Hui, Filippo Menczer

Abstract

Efficient and reliable social bot classification is crucial for detecting information manipulation on social media. Despite rapid development, state-of-the-art bot detection models still face generalization and scalability challenges, which greatly limit their applications. In this paper we propose a framework that uses minimal account metadata, enabling efficient analysis that scales up to handle the full stream of public tweets of Twitter in real time. To ensure model accuracy, we build a rich collection of labeled datasets for training and validation. We deploy a strict validation system so that model performance on unseen datasets is also optimized, in addition to traditional cross-validation. We find that strategically selecting a subset of training data yields better model accuracy and generalization than exhaustively training on all available data. Thanks to the simplicity of the proposed model, its logic can be interpreted to provide insights into social bot characteristics.

Publisher

Association for the Advancement of Artificial Intelligence (AAAI)

Subject

General Medicine

Cited by 163 articles.

1. Unsupervised selective labeling for semi-supervised industrial defect detection; Journal of King Saud University - Computer and Information Sciences; 2024-10

2. Not our kind of crowd! How partisan bias distorts perceptions of political bots on Twitter (now X); British Journal of Social Psychology; 2024-08-29

3. Unsupervised Social Bot Detection via Structural Information Theory; ACM Transactions on Information Systems; 2024-08-19

4. Coarse-to-fine label propagation with hybrid representation for deep semi-supervised bot detection; Wireless Networks; 2024-08-14

5. Amplifying Hate: Mapping the Political Twitter Ecosystem and Toxic Enablers in Greece; Social Media and Modern Society [Working Title]; 2024-07-23