Finding the Source in Networks: An Approach Based on Structural Entropy

Published:2023-02-28 Issue:1 Volume:23 Page:1-25
ISSN:1533-5399
Container-title:ACM Transactions on Internet Technology
language:en
Short-container-title:ACM Trans. Internet Technol.

Author:

Zhang Chong¹, Guo Qiang¹, Fu Luoyi¹, Ding Jiaxin¹, Cao Xinde¹, Long Fei², Wang Xinbing¹, Zhou Chenghu³

Affiliation:

1. Shanghai Jiao Tong University, Shanghai, China

2. Xinhua News Agency, Beijing, China

3. Chinese Academy of Sciences, Beijing, China

Abstract

The popularity of intelligent devices provides straightforward access to the Internet and online social networks. However, the quick and easy data updates that networks enable also facilitate the spread of risks such as rumors, malware, and computer viruses. To this end, this article studies the problem of source detection: inferring the source node from the aftermath of a cascade, that is, from the observed infected graph G_N of the network at some time. Prior work has adopted various statistical quantities, such as degree, distance, or infection size, to reflect the structural centrality of the source. In this article, we propose a new metric, which we call the infected tree entropy (ITE), to utilize richer underlying structural features for source detection. The idea of ITE is inspired by the concept of structural entropy [21], which demonstrated that minimizing the average number of bits needed to encode the network structure under different partitions is the principle for detecting the natural or true structures in real-world networks. Accordingly, our proposed ITE-based estimator for the source tries to minimize the coding of network partitions induced by the infected trees rooted at each potential source, thus minimizing the structural deviation between the cascades from the potential sources and the actual infection process contained in G_N. On polynomially growing geometric trees, with increasing tree heterogeneity, the ITE estimator yields remarkably more reliable detection under only moderate infection sizes, and achieves asymptotically complete detection. In contrast, for regular expanding trees, we still observe a guaranteed detection probability for the ITE estimator even with an infinite infection size, thanks to the degree regularity property. We also realize ITE-based detection algorithmically with linear time complexity via a message-passing scheme, and further extend it to general graphs. Extensive experiments on synthetic and real datasets confirm the superiority of ITE over the baselines. For example, on scale-free networks ITE ranks the source among the top 10% with an accuracy of 85%, far exceeding the 55% of the classic algorithm.
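
To make the source detection setting concrete, the following is a minimal sketch of a distance-based baseline of the kind the abstract attributes to prior work. It is not the ITE estimator, whose definition is not given in this record. The function names (bfs_distances, distance_center) and the toy infected tree are hypothetical, and the infected graph is assumed to be connected and given as an adjacency dictionary.

```python
from collections import deque


def bfs_distances(adj, start):
    """Hop distances from `start` to every reachable node of the infected graph."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def distance_center(adj):
    """Return the infected node with the smallest total distance to all others.

    Illustrative distance-centrality baseline only; assumes `adj` is connected.
    Ties are broken in favour of the smallest node label.
    """
    best_node, best_total = None, float("inf")
    for node in sorted(adj):
        total = sum(bfs_distances(adj, node).values())
        if total < best_total:
            best_node, best_total = node, total
    return best_node


# Hypothetical infected tree: path 0-1-2-3-4 with an extra leaf 5 attached to node 2.
infected = {0: [1], 1: [0, 2], 2: [1, 3, 5], 3: [2, 4], 4: [3], 5: [2]}
estimate = distance_center(infected)
print("estimated source:", estimate)
```

This sketch runs one breadth-first search per candidate, so it is quadratic in the number of infected nodes; the abstract states that the ITE-based detection instead reaches linear time through message passing.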

Funder

NSF China

100-Talents Program of Xinhua News Agency, and the Program of Shanghai Academic/Technology Research Leader

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Networks and Communications

Link

https://dl.acm.org/doi/pdf/10.1145/3568309

References: 41 articles.

1. A fast Monte Carlo algorithm for source localization on graphs;Agaskar Ameya, Lu Yue M.;Proceedings of Wavelets and Sparsity XV, Vol. 8858, 429–434,2013

2. Entropy measures for networks: Toward an information theory of complex topologies;Anand Kartik;Physical Review E,2009

3. Emergence of scaling in random networks;Barabási Albert-László;Science,1999

4. Entropy of network ensembles;Bianconi Ginestra;Physical Review E,2009

5. Information theory, distance matrix, and molecular branching;Bonchev D.;The Journal of Chemical Physics,1977

Cited by 2 articles.

1. Disinformation detection using graph neural networks: a survey;Artificial Intelligence Review;2024-02-14

2. Evidence-Aware Fake News Detection: A Review;2023 International Conference on Advanced Computing & Communication Technologies (ICACCTech);2023-12-23