Affiliation:
1. Cornell University, Ithaca, New York
Abstract
This article considers the problem of estimating a power-law degree distribution of an undirected network using sampled data. Although power-law degree distributions are ubiquitous in nature, the widely used parametric methods for estimating them (e.g., linear regression on double-logarithmic axes and maximum likelihood estimation with uniformly sampled nodes) suffer from the large variance introduced by the lack of data-points from the tail portion of the power-law degree distribution. As a solution, we present a novel maximum likelihood estimation approach that exploits the
friendship paradox
to sample more efficiently from the tail of the degree distribution. We analytically show that the proposed method results in a smaller bias, variance and a Cramèr–Rao lower bound compared to the vanilla maximum likelihood estimate obtained with uniformly sampled nodes (which is the most commonly used method in literature). Detailed numerical and empirical results are presented to illustrate the performance of the proposed method under different conditions and how it compares with alternative methods. We also show that the proposed method and its desirable properties (i.e., smaller bias, variance, and Cramèr–Rao lower bound compared to vanilla method based on uniform samples) extend to parametric degree distributions other than the power-law such as exponential degree distributions as well. All the numerical and empirical results are reproducible and the code is publicly available on Github.
Funder
U. S. Army Research Office
National Science Foundation
Publisher
Association for Computing Machinery (ACM)
Cited by
5 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献