Abstract
We design and implement a distributed algorithm for balanced k-way hypergraph partitioning that minimizes fanout, a fundamental hypergraph quantity also known as the communication volume and (k - 1)-cut metric, by optimizing a novel objective called probabilistic fanout.
This choice allows a simple local search heuristic to achieve comparable solution quality to the best existing hypergraph partitioners.
Our algorithm is arbitrarily scalable due to a careful design that controls computational complexity, space complexity, and communication. In practice, we commonly process hypergraphs with billions of vertices and hyperedges in a few hours. We explain how the algorithm's scalability, both in terms of hypergraph size and bucket count, is limited only by the number of machines available. We perform an extensive comparison to existing distributed hypergraph partitioners and find that our approach is able to optimize hypergraphs roughly 100 times bigger on the same set of machines.
We call the resulting tool Social Hash Partitioner, and accompanying this paper, we open-source the most scalable version based on recursive bisection.
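To make the two objectives named in the abstract concrete, the following minimal sketch computes fanout for a toy hypergraph and a smoothed variant of it. The abstract does not define probabilistic fanout, so the formula used here (the expected number of buckets touched when each vertex of a hyperedge is kept independently with probability p) is an assumption for illustration, as are the function names and the toy data; it is not the tool's actual implementation.

```python
from collections import defaultdict


def fanout(hyperedge, bucket_of):
    """Number of distinct buckets touched by the vertices of one hyperedge."""
    return len({bucket_of[v] for v in hyperedge})


def probabilistic_fanout(hyperedge, bucket_of, p):
    """Smoothed fanout (assumed definition, not taken from the abstract).

    Each vertex is kept independently with probability p; the result is the
    expected number of buckets that still contain at least one kept vertex.
    """
    counts = defaultdict(int)
    for v in hyperedge:
        counts[bucket_of[v]] += 1
    # A bucket holding n vertices of the hyperedge survives with
    # probability 1 - (1 - p)^n.
    return sum(1.0 - (1.0 - p) ** n for n in counts.values())


def average_fanout(hyperedges, bucket_of):
    """Mean fanout over all hyperedges; the quantity a partitioner would minimize."""
    return sum(fanout(e, bucket_of) for e in hyperedges) / len(hyperedges)


# Hypothetical toy instance: 6 vertices, 3 hyperedges, k = 2 balanced buckets.
hyperedges = [("a", "b", "c"), ("c", "d"), ("a", "d", "e", "f")]
bucket_of = {"a": 0, "b": 0, "c": 1, "d": 1, "e": 0, "f": 1}

if __name__ == "__main__":
    print("average fanout:", average_fanout(hyperedges, bucket_of))
    print("p-fanout of first hyperedge (p=0.5):",
          probabilistic_fanout(hyperedges[0], bucket_of, 0.5))
```

Under this assumed definition, setting p = 1 recovers plain fanout, while smaller p rewards moves that concentrate a hyperedge's vertices in fewer buckets even when the bucket count itself does not yet change, which is one plausible reason a simple local search could benefit from such an objective.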
Subject
General Earth and Planetary Sciences, Water Science and Technology, Geography, Planning and Development
Cited by
36 articles.