BFS-based distributed algorithm for parallel local-directed subgraph enumeration
Authors:
Levinas Itay (1),
Scherz Roy (1),
Louzoun Yoram (1)
Affiliation:
1. Department of Mathematics, Bar-Ilan University, Ramat Gan, 5290000, Israel
Abstract
Estimating the frequency of subgraphs is important for many tasks, including subgraph isomorphism, kernel-based anomaly detection and network structure analysis. Multiple algorithms have been proposed for full enumeration or sampling-based estimation, but these methods fail on very large graphs. Recent advances in parallelization allow total subgraph counts to be estimated in very large graphs. The task of counting the frequency of each subgraph associated with each vertex has also been solved well for undirected graphs. However, there is currently no good solution for very large directed graphs.
We propose VDMC (Vertex specific Distributed Motif Counting), a fully distributed algorithm that counts all connected directed subgraphs on three and four vertices (network motifs) associated with each vertex of a graph. VDMC counts each motif only once, and its cost is linear in the number of counted motifs. It is fully parallelized to run efficiently on GPUs. VDMC is based on three main elements: (1) ordering the vertices and counting only motifs whose vertices appear in increasing order; (2) sub-ordering motifs based on the average depth of the tree spanning them via a BFS traversal; and (3) removing isomorphisms only once for the entire graph. We compare VDMC to analytical estimates of the expected number of motifs in Erdős–Rényi graphs and show that it is accurate. VDMC is available as highly efficient CPU and GPU code with a novel data structure for efficient graph manipulation. We show the efficacy of VDMC on real-world graphs. VDMC allows precise analysis of subgraph frequency around each vertex in large graphs, and opens the way to extending methods that were until now limited to graphs with thousands of edges to graphs with millions of edges or more.
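The three elements listed above can be illustrated for the three-vertex case with a short Python sketch. This is not the VDMC implementation or the graph-measures API: the names `count_3_motifs`, `CANON`, `BIT` and `PAIRS` are hypothetical, the vertex order is assumed to be the natural order of the labels, and the depth-based split is a simplified stand-in for the BFS sub-ordering described in the abstract. Each connected triple is rooted at its smallest vertex and found exactly once, either with both other vertices at depth 1 or with one at depth 1 and one at depth 2. Isomorphic edge patterns are merged through a lookup table that is built once, not per triple.

```python
from collections import Counter, defaultdict
from itertools import combinations, permutations

# One bit per possible directed edge among three local positions 0, 1, 2.
PAIRS = [(a, b) for a in range(3) for b in range(3) if a != b]
BIT = {pair: 1 << k for k, pair in enumerate(PAIRS)}


def _relabel(code, perm):
    """Apply a permutation of the three positions to an edge bitmask."""
    out = 0
    for (a, b), bit in BIT.items():
        if code & bit:
            out |= BIT[(perm[a], perm[b])]
    return out


# Built once for the whole graph: raw 6-bit code -> canonical class id
# (the smallest code among all relabelings of the same pattern).
CANON = [min(_relabel(code, p) for p in permutations(range(3)))
         for code in range(64)]


def count_3_motifs(edges):
    """Return {vertex: Counter({canonical_code: count})} for a directed graph.

    Assumption: vertex labels are comparable and their natural order is used
    as the vertex ordering.
    """
    out_nbrs = defaultdict(set)   # directed successors
    und = defaultdict(set)        # neighbours ignoring direction
    for u, v in edges:
        if u == v:
            continue              # ignore self-loops
        out_nbrs[u].add(v)
        und[u].add(v)
        und[v].add(u)

    counts = defaultdict(Counter)

    def record(trio):
        code = 0
        for (a, b), bit in BIT.items():
            if trio[b] in out_nbrs.get(trio[a], ()):
                code |= bit
        for v in trio:
            counts[v][CANON[code]] += 1

    # Each root is independent of the others, so this loop could be
    # distributed across workers.
    for root in und:
        higher = sorted(w for w in und[root] if w > root)
        # Depths (1, 1): both other vertices are adjacent to the root.
        for a, b in combinations(higher, 2):
            record((root, a, b))
        # Depths (1, 2): b is reached through a and is not adjacent to the root.
        for a in higher:
            for b in und[a]:
                if b > root and b not in und[root]:
                    record((root, a, b))
    return counts


if __name__ == "__main__":
    example = [(0, 1), (1, 2), (2, 0), (2, 3)]
    for vertex, motif_counts in sorted(count_3_motifs(example).items()):
        print(vertex, dict(motif_counts))
```

Rooting every triple at its smallest vertex is what removes the need for a deduplication pass: no triple is generated from two different roots, and the two depth cases are mutually exclusive. The four-vertex case needs more depth patterns and a larger lookup table, which this sketch does not attempt.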
GIT: https://github.com/louzounlab/graph-measures/
PyPI: https://pypi.org/project/graph-measures/
Publisher
Oxford University Press (OUP)
Subject
Applied Mathematics, Computational Mathematics, Control and Optimization, Management Science and Operations Research, Computer Networks and Communications