Affiliation:
1. University of California, Santa Cruz, CA
2. University of New Mexico, Albuquerque, and the Santa Fe Institute, New Mexico
3. University of Southern California, Los Angeles, CA
Abstract
Understanding the graph structure of the Internet is a crucial step for building accurate network models and designing efficient algorithms for Internet applications. Yet, obtaining this graph structure can be a surprisingly difficult task, as edges cannot be explicitly queried. For instance, empirical studies of the network of Internet Protocol (IP) addresses typically rely on indirect methods like traceroute to build what are approximately single-source, all-destinations, shortest-path trees. These trees only sample a fraction of the network's edges, and a paper by Lakhina et al. [2003] found empirically that the resulting sample is intrinsically biased. Further, in simulations, they observed that the degree distribution under traceroute sampling exhibits a power law even when the underlying degree distribution is Poisson.
In this article, we study the bias of traceroute sampling mathematically and, for a very general class of underlying degree distributions, explicitly calculate the distribution that will be observed. As example applications of our machinery, we prove that traceroute sampling finds power-law degree distributions in both δ-regular and Poisson-distributed random graphs. Thus, our work puts the observations of Lakhina et al. on a rigorous footing, and extends them to nearly arbitrary degree distributions.
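The sampling process described in the abstract can be illustrated with a minimal sketch. It assumes that traceroute sampling is idealized as a single breadth-first (shortest-path) tree from one source, and that the underlying network is an Erdős-Rényi random graph, whose degrees are approximately Poisson. The function names, graph size, and mean degree are illustrative choices, not taken from the article, and the sketch does not reproduce the article's analysis; it only shows that the tree contains a fraction of the edges and that each node's observed degree is at most its true degree.

```python
import random
from collections import deque


def random_graph(n, mean_degree, rng):
    """Erdős-Rényi G(n, p) with p = mean_degree / (n - 1).

    Degrees in this model are approximately Poisson distributed.
    """
    p = mean_degree / (n - 1)
    adj = [set() for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def bfs_tree(adj, source):
    """Parent map of a breadth-first (shortest-path) tree rooted at source."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return parent


def observed_degrees(parent):
    """Degree of each reached node counting only tree edges."""
    deg = {v: 0 for v in parent}
    for v, p in parent.items():
        if p is not None:
            deg[v] += 1
            deg[p] += 1
    return deg


# Illustrative parameters (assumptions, not values from the article).
rng = random.Random(0)
adj = random_graph(1000, 10.0, rng)
parent = bfs_tree(adj, source=0)
obs = observed_degrees(parent)

true_edges = sum(len(nbrs) for nbrs in adj) // 2
tree_edges = len(parent) - 1
print(f"edges in graph: {true_edges}, edges seen by the tree: {tree_edges}")

# Compare true and observed mean degree over the reached nodes.
mean_true = sum(len(adj[v]) for v in obs) / len(obs)
mean_obs = sum(obs.values()) / len(obs)
print(f"mean true degree: {mean_true:.2f}, mean observed degree: {mean_obs:.2f}")
```

A tree on k reached nodes has exactly k - 1 edges, so the observed mean degree stays below 2 regardless of the true mean degree. Plotting a histogram of `obs.values()` against the true degrees is one way to see how differently the two distributions are shaped.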
Funder
National Science Foundation
Division of Computing and Communication Foundations
European Research Council
Division of Physics
Publisher
Association for Computing Machinery (ACM)
Subject
Artificial Intelligence, Hardware and Architecture, Information Systems, Control and Systems Engineering, Software
Cited by
41 articles.
1. QClique: Optimizing Performance and Accuracy in Maximum Weighted Clique;Lecture Notes in Computer Science;2024
2. Partitioning Communication Streams Into Graph Snapshots;IEEE Transactions on Network Science and Engineering;2023-03-01
3. A simple algorithm for graph reconstruction;Random Structures & Algorithms;2023-02-16
4. Distributed Data-Driven Control of Network Systems;IEEE Open Journal of Control Systems;2023
5. Web Mining;Machine Learning for Data Science Handbook;2023