Characterizing Directed and Undirected Networks via Multidimensional Walks with Jumps

Author:

Murai Fabricio (1), Ribeiro Bruno (2), Towsley Don (3), Wang Pinghui (4)

Affiliation:

1. Universidade Federal de Minas Gerais, Belo Horizonte, Brazil

2. Purdue University, West Lafayette, IN, USA

3. University of Massachusetts Amherst, Amherst, MA, USA

4. Xi’an Jiaotong University, Xi’an, China

Abstract

Estimating distributions of node characteristics (labels), such as the number of connections or the citizenship of users in a social network, via edge and node sampling is a vital part of the study of complex networks. Due to its low cost, sampling via a random walk (RW) has been proposed as an attractive solution to this task. Most RW methods assume either that the network is undirected or that walkers can traverse edges regardless of their direction. Some RW methods have been designed for directed networks where edges coming into a node are not directly observable. In this work, we propose Directed Unbiased Frontier Sampling (DUFS), a sampling method based on a large number of coordinated walkers, each starting from a node chosen uniformly at random. It applies to directed networks with invisible incoming edges because it constructs, in real time, an undirected graph consistent with the walkers' trajectories, and because it uses random jumps to prevent walkers from being trapped. DUFS generalizes previous RW methods and is suited to undirected networks and to directed networks regardless of in-edge visibility. We also propose an improved estimator of node label distribution that combines information from initial walker locations with subsequent RW observations. We evaluate DUFS, compare it to other RW methods, investigate the impact of its parameters on estimation accuracy, and provide practical guidelines for choosing them. In estimating out-degree distributions, DUFS yields significantly better estimates of the head of the distribution than other methods, while matching or exceeding their estimation accuracy in the tail. Finally, we show that DUFS outperforms uniform sampling when estimating distributions of node labels among the 10% of nodes with the largest degree, even when sampling a node uniformly has the same cost as an RW step.
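The idea of a random walk with jumps can be illustrated with a much simpler setting than the one the abstract describes. The sketch below is not DUFS: it uses a single walker on an undirected graph that is fully known in advance, and it omits the coordinated walkers, the on-the-fly graph construction, and the improved estimator. The function name `rw_with_jumps_estimate`, the jump weight `w`, and the toy graph are assumptions made for illustration only. At a node of degree d the walker jumps to a uniformly chosen node with probability w / (d + w) and otherwise moves to a random neighbour, so nodes are visited in proportion to d + w, and each observation is reweighted by 1 / (d + w) to correct that bias.

```python
import random
from collections import Counter


def rw_with_jumps_estimate(adj, labels, steps, w, seed=0):
    """Estimate the node label distribution from one random walk with jumps.

    adj:    dict mapping node -> list of neighbours (undirected, symmetric)
    labels: dict mapping node -> label
    steps:  number of walk steps (samples)
    w:      jump weight, assumed > 0 so isolated nodes cannot trap the walker
    """
    rng = random.Random(seed)
    nodes = list(adj)
    current = rng.choice(nodes)      # start from a uniformly chosen node
    weighted = Counter()             # reweighted label counts
    total = 0.0
    for _ in range(steps):
        degree = len(adj[current])
        # The walk visits a node with probability proportional to degree + w,
        # so each sample is weighted by the inverse to undo that bias.
        inverse = 1.0 / (degree + w)
        weighted[labels[current]] += inverse
        total += inverse
        if rng.random() < w / (degree + w):
            current = rng.choice(nodes)          # uniform random jump
        else:
            current = rng.choice(adj[current])   # ordinary walk step
    return {label: value / total for label, value in weighted.items()}


if __name__ == "__main__":
    # Toy star graph (hypothetical data): node 0 is the hub, 1..5 are leaves.
    star = {0: [1, 2, 3, 4, 5], 1: [0], 2: [0], 3: [0], 4: [0], 5: [0]}
    degree_labels = {v: len(nbrs) for v, nbrs in star.items()}
    print(rw_with_jumps_estimate(star, degree_labels, steps=50000, w=1.0))
```

With the node degree used as the label, the returned dictionary is an estimate of the degree distribution; on the star graph the true values are 5/6 for degree 1 and 1/6 for degree 5.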

Funder

National Council for Scientific and Technological Development - Brazil

Army Research Laboratory under Cooperative Agreement

Universidade Federal de Minas Gerais under the Programa Institucional de Auxílio à Pesquisa de Docentes Recém-Contratados

MURI ARO

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science


Cited by 1 article.

1. Sampling Based Estimation of In-Degree Distribution for Directed Complex Networks; Journal of Computational and Graphical Statistics; 2021-03-12
