Affiliation:
1. Department of Computer Science, University of Illinois at Urbana–Champaign
Abstract
This article narrows the gap between physical sensing systems that measure physical signals and social sensing systems that measure information signals by (i) defining a novel algorithm for extracting information signals (building on results from text embedding) and (ii) showing that it increases the accuracy of truth discovery, the separation of true information from false or manipulated information. The work is applied in the context of separating true and false facts on social media, such as Twitter and Reddit, where users post predominantly short microblogs. The new algorithm decides how to aggregate the signal across words in a microblog for the purpose of clustering microblogs in the latent information signal space, where it is easier to separate true and false posts. Although previous literature has extensively studied the problem of short text embedding/representation, this article improves on previous work in three important respects: (1) Our work constitutes unsupervised truth discovery, requiring no labeled input or prior training. (2) We propose a new distance metric for efficient short text similarity estimation, which we call Semantic Subset Matching, that improves our ability to meaningfully cluster microblog posts in the latent information signal space. (3) We introduce an iterative framework that jointly improves microblog clustering and truth discovery. The evaluation shows that the approach improves the accuracy of truth discovery by 6.3%, 2.5%, and 3.8% (constituting a 38.9%, 14.2%, and 18.7% reduction in error, respectively) on three real Twitter data traces.
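The relationship between the reported accuracy gains and error reductions can be checked with a short calculation. The sketch below assumes that the accuracy gains are absolute percentage points and that the error reduction is measured relative to the baseline error rate (gain = reduction × baseline error); the abstract does not state either convention explicitly, and the function names are illustrative only. Under those assumptions, it recovers the baseline and improved error rates implied by each pair of figures.

```python
# Figures quoted in the abstract, one pair per Twitter data trace:
# (accuracy gain, relative error reduction), both in percent.
REPORTED = [(6.3, 38.9), (2.5, 14.2), (3.8, 18.7)]


def implied_baseline_error(gain_pts: float, reduction_pct: float) -> float:
    """Baseline error (in %) implied by: gain = (reduction / 100) * baseline.

    Assumption: gain_pts is an absolute gain in percentage points and
    reduction_pct is relative to the baseline error rate.
    """
    return gain_pts / (reduction_pct / 100.0)


def implied_errors(reported):
    """Return (baseline error, improved error) in % for each trace."""
    rows = []
    for gain, reduction in reported:
        baseline = implied_baseline_error(gain, reduction)
        rows.append((round(baseline, 1), round(baseline - gain, 1)))
    return rows


if __name__ == "__main__":
    for (gain, reduction), (base, new) in zip(REPORTED, implied_errors(REPORTED)):
        print(f"gain {gain} pts, reduction {reduction}%: "
              f"implied error {base}% -> {new}%")
```

Under these assumptions, the three traces imply baseline error rates of roughly 16%, 18%, and 20%, which is consistent with the figures being absolute gains over a moderately accurate baseline.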
Funder
DARPA
Basic Research Office
Army Research Laboratory under Cooperative Agreement
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Networks and Communications
Cited by
1 article.
1. The voice of silence;Proceedings of the 2021 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining;2021-11-08