ClipSim: A GPU-friendly Parallel Framework for Single-Source SimRank with Accuracy Guarantee

Author:

Wu Tianhao (1), Cheng Ji (2), Zhang Chaorui (3), Hou Jianfeng (3), Chen Gengjian (4), Huang Zhongyi (1), Zhang Weixi (3), Han Wei (3), Bai Bo (3)

Affiliation:

1. Tsinghua University, Beijing, China

2. Hong Kong University of Science and Technology, Hong Kong, China

3. Theory Lab, 2012 Labs & Huawei Technologies Co., Ltd., Hong Kong, China

4. Wuhan University, Wuhan, China

Abstract

SimRank is an important metric for measuring the topological similarity between two nodes in a graph. In particular, single-source and top-k SimRank have numerous applications in recommendation systems, network analysis, and web mining. Mathematically, given a vertex, the computation of single-machine, single-source SimRank consists mainly of matrix-matrix operations. However, performing these operations directly on large graphs is almost impossible. Existing works therefore resort to two main operations: a series of random walks, and multiplications of a sparse matrix with a dense vector. Both incur a high computation cost for SimRank on large graphs. In real-world applications, there is always a trade-off between query time and accuracy, which hinders the computation of high-precision SimRank on large-scale graphs. To handle this problem, this paper proposes ClipSim, the first GPU-friendly parallel framework that accelerates single-source SimRank on GPUs with an accuracy guarantee. We design a novel data structure and GPU-friendly parallel algorithms for efficient computation of all the operations of SimRank on GPUs. Moreover, our theoretical derivation enables ClipSim to greatly reduce the number of random walks required for each node, while maintaining the same theoretical accuracy as the state-of-the-art algorithm, ExactSim. We conduct extensive experiments on real-world and synthetic datasets to demonstrate the accuracy and efficiency of ClipSim. The results show that, compared with ExactSim, ClipSim obtains single-source SimRank vectors with the same accuracy while running up to 160× faster.
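
For readers unfamiliar with the quantity being computed, the standard SimRank recurrence can be illustrated with a naive dense iteration. The sketch below is neither ClipSim nor ExactSim, and it is exactly the kind of matrix-matrix computation the abstract describes as infeasible on large graphs. The function name `single_source_simrank`, the decay factor `c = 0.6`, and the iteration count are assumptions chosen for illustration only.

```python
import numpy as np


def single_source_simrank(in_neighbors, source, c=0.6, iterations=20):
    """Naive single-source SimRank via the full matrix recurrence.

    in_neighbors[v] lists the in-neighbours of node v (no duplicates).
    Returns the SimRank scores between `source` and every node.
    Illustrative only: cost is cubic in the number of nodes per iteration.
    """
    n = len(in_neighbors)

    # P[u, v] = 1 / |I(v)| if u is an in-neighbour of v, else 0.
    P = np.zeros((n, n))
    for v, preds in enumerate(in_neighbors):
        for u in preds:
            P[u, v] = 1.0 / len(preds)

    # Iterate S <- c * P^T S P, then reset the diagonal to 1,
    # which matches s(a, a) = 1 in the SimRank definition.
    S = np.eye(n)
    for _ in range(iterations):
        S = c * (P.T @ S @ P)
        np.fill_diagonal(S, 1.0)

    return S[source]


# Tiny example: node 0 points to nodes 1 and 2, so nodes 1 and 2
# share their only in-neighbour and s(1, 2) = c * s(0, 0) = 0.6.
scores = single_source_simrank([[], [0], [0]], source=1)
print(scores)
```

Frameworks such as the one described in the abstract avoid materialising the full similarity matrix; this dense version is useful only as a reference for checking results on very small graphs.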

Funder

National Natural Science Foundation of China

Publisher

Association for Computing Machinery (ACM)
