Abstract
AbstractRoleSim and SimRank are among the popular graph-theoretic similarity measures with many applications in, e.g., web search, collaborative filtering, and sociometry. While RoleSim addresses the automorphic (role) equivalence of pairwise similarity which SimRank lacks, it ignores the neighboring similarity information out of the automorphically equivalent set. Consequently, two pairs of nodes, which are not automorphically equivalent by nature, cannot be well distinguished by RoleSim if the averages of their neighboring similarities over the automorphically equivalent set are the same. To alleviate this problem: 1) We propose a novel similarity model, namely RoleSim*, which accurately evaluates pairwise role similarities in a more comprehensive manner. RoleSim* not only guarantees the automorphic equivalence that SimRank lacks, but also takes into account the neighboring similarity information outside the automorphically equivalent sets that are overlooked by RoleSim. 2) We prove the existence and uniqueness of the RoleSim* solution, and show its three axiomatic properties (i.e., symmetry, boundedness, and non-increasing monotonicity). 3) We provide a concise bound for iteratively computing RoleSim* formula, and estimate the number of iterations required to attain a desired accuracy. 4) We induce a distance metric based on RoleSim* similarity, and show that the RoleSim* metric fulfills the triangular inequality, which implies the sum-transitivity of its similarity scores. 5) We present a threshold-based RoleSim* model that reduces the computational time further with provable accuracy guarantee. 6) We propose a single-source RoleSim* model, which scales well for sizable graphs. 7) We also devise methods to scale RoleSim* based search by incorporating its triangular inequality property with partitioning techniques. Our experimental results on real datasets demonstrate that RoleSim* achieves higher accuracy than its competitors while scaling well on sizable graphs with billions of edges.
Funder
National Natural Science Foundation of China
Natural Science Foundation of Jiangsu Province
Publisher
Springer Science and Business Media LLC
Subject
Computer Networks and Communications,Hardware and Architecture,Software
Reference42 articles.
1. Antonellis, I., Garcia-Molina, H., Chang, C.-C.: SimRank++: Query rewriting through link analysis of the click graph. PVLDB, 1(1) (2008)
2. Bijsterbosch, J., Volgenant, A.: Solving the rectangular assignment problem and applications. Ann. Oper. Res. 181(1), 443–462 (2010)
3. Chen, H., Giles, C.L.: ASCOS++: an asymmetric similarity measure for weighted networks to address the problem of simrank. ACM Trans. Knowl. Discov. Data 10(2), 15:1–15:26 (2015)
4. Fujiwara, Y., Nakatsuji, M., Shiokawa, H., Onizuka, M.: Efficient search algorithm for SimRank. In: ICDE, pp 589–600 (2013)
5. He, G., Feng, H., Li, C., Chen, H.: Parallel SimRank computation on large graphs with iterative aggregation. In: KDD (2010)
Cited by
2 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献