Author:
Gösgens Martijn,van der Hofstad Remco,Litvak Nelly
Abstract
We present the class of projection methods for community detection that generalizes many popular community detection methods. In this framework, we represent each clustering (partition) by a vector on a high-dimensional hypersphere. A community detection method is a projection method if it can be described by the following two-step approach: 1) the graph is mapped to a query vector on the hypersphere; and 2) the query vector is projected on the set of clustering vectors. This last projection step is performed by minimizing the distance between the query vector and the clustering vector, over the set of clusterings. We prove that optimizing Markov stability, modularity, the likelihood of planted partition models and correlation clustering fit this framework. A consequence of this equivalence is that algorithms for each of these methods can be modified to perform the projection step in our framework. In addition, we show that these different methods suffer from the same granularity problem: they have parameters that control the granularity of the resulting clustering, but choosing these to obtain clusterings of the desired granularity is nontrivial. We provide a general heuristic to address this granularity problem, which can be applied to any projection method. Finally, we show how, given a generator of graphs with community structure, we can optimize a projection method for this generator in order to obtain a community detection method that performs well on this generator.
Funder
Nederlandse Organisatie voor Wetenschappelijk Onderzoek