Author:
Jelenković Predrag R., Olvera-Cravioto Mariana
Abstract
In this paper we consider the stochastic analysis of information ranking algorithms of large interconnected data sets, e.g. Google's PageRank algorithm for ranking pages on the World Wide Web. The stochastic formulation of the problem results in an equation of the form $R \stackrel{D}{=} \sum_{i=1}^{N} C_i R_i + Q$, where $N$, $Q$, $\{R_i\}_{i \ge 1}$, and $\{C, C_i\}_{i \ge 1}$ are independent nonnegative random variables, the $\{C, C_i\}_{i \ge 1}$ are identically distributed, and the $\{R_i\}_{i \ge 1}$ are independent copies of $R$; $\stackrel{D}{=}$ stands for equality in distribution. We study the asymptotic properties of the distribution of $R$ that, in the context of PageRank, represents the frequencies of highly ranked pages. The preceding equation is interesting in its own right since it belongs to a more general class of weighted branching processes that have been found to be useful in the analysis of many other algorithms. Our first main result shows that if $E[N]E[C^{\alpha}] = 1$, $\alpha > 0$, and $Q$, $N$ satisfy additional moment conditions, then $R$ has a power law distribution of index $\alpha$. This result is obtained using a new approach based on an extension of Goldie's (1991) implicit renewal theorem. Furthermore, when $N$ is regularly varying of index $\alpha > 1$, $E[N]E[C^{\alpha}] < 1$, and $Q$, $C$ have higher moments than $\alpha$, then the distributions of $R$ and $N$ are tail equivalent. The latter result is derived via a novel sample path large deviation method for recursive random sums. Similarly, we characterize the situation when the distribution of $R$ is determined by the tail of $Q$. The preceding approaches may be of independent interest, as they can be used for analyzing other functionals on trees. We also briefly discuss the engineering implications of our results.
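As an illustration only, the recursion in the abstract can be explored numerically by iterating it on a pool of samples. The sketch below is not the authors' method: the pool-resampling scheme, the function names `iterate_pool` and `simulate`, and the toy distributions for N, C, and Q are all assumptions chosen for simplicity. The toy inputs are light-tailed, so the example shows only how the recursion is iterated and does not reproduce the power-law regime.

```python
import random


def iterate_pool(pool, sample_n, sample_c, sample_q, rng):
    """Apply the recursion once: each new value is Q + sum_{i=1}^{N} C_i * R_i,
    with the R_i drawn independently (with replacement) from the current pool."""
    new_pool = []
    for _ in range(len(pool)):
        n = sample_n(rng)
        total = sample_q(rng)
        for _ in range(n):
            total += sample_c(rng) * rng.choice(pool)
        new_pool.append(total)
    return new_pool


def simulate(pool_size=2000, steps=10, seed=0):
    """Iterate the recursion from R = 0 under hypothetical toy distributions."""
    rng = random.Random(seed)
    # Assumed toy inputs (not from the paper):
    sample_n = lambda r: r.randint(0, 4)      # N uniform on {0,...,4}, E[N] = 2
    sample_c = lambda r: r.uniform(0.0, 0.5)  # C uniform on [0, 0.5], E[C] = 0.25
    sample_q = lambda r: 1.0                  # Q = 1 (constant)
    pool = [0.0] * pool_size
    for _ in range(steps):
        pool = iterate_pool(pool, sample_n, sample_c, sample_q, rng)
    return pool


if __name__ == "__main__":
    pool = simulate()
    print("empirical mean of R:", sum(pool) / len(pool))
```

With these toy inputs E[N]E[C] = 0.5 < 1, so taking expectations on both sides of the equation gives E[R] = E[Q] / (1 - E[N]E[C]) = 2, and the empirical mean should land near that value.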
Publisher
Cambridge University Press (CUP)
Subject
Applied Mathematics, Statistics and Probability
Cited by
9 articles.