Affiliation:
1. The University of Texas at Austin, Austin, TX, USA
Abstract
Graph analytics systems must analyze graphs with billions of vertices and edges which require several terabytes of storage. Distributed-memory clusters are often used for analyzing such large graphs since the main memory of a single machine is usually restricted to a few hundreds of gigabytes. This requires partitioning the graph among the machines in the cluster. Existing graph analytics systems use a built-in partitioner that incorporates a particular partitioning policy, but the best policy is dependent on the algorithm, input graph, and platform. Therefore, built-in partitioners are not sufficiently flexible. Stand-alone graph partitioners are available, but they too implement only a few policies.
CuSP is a fast streaming edge partitioning framework which permits users to specify the desired partitioning policy at a high level of abstraction and quickly generates highquality graph partitions. For example, it can partition wdc12, the largest publicly available web-crawl graph with 4 billion vertices and 129 billion edges, in under 2 minutes for clusters with 128 machines. Our experiments show that it can produce quality partitions 6× faster on average than the state-of-theart stand-alone partitioner in the literature while supporting a wider range of partitioning policies.
Publisher
Association for Computing Machinery (ACM)
Cited by
3 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献
1. An adaptive graph sampling framework for graph analytics;Social Network Analysis and Mining;2023-12-06
2. RAGraph: A Region-Aware Framework for Geo-Distributed Graph Processing;Proceedings of the VLDB Endowment;2023-11
3. RepCut: Superlinear Parallel RTL Simulation with Replication-Aided Partitioning;Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3;2023-03-25