Affiliation:
1. The Chinese University of Hong Kong, Shenzhen
2. Hong Kong Baptist University
Abstract
In a heterogeneous information network (HIN), a motif-clique is a "complete graph" for a given motif (i.e., a small connected graph) that captures the desired relationship in the motif. Maximal motif-cliques in HINs have found applications in community discovery, recommendation, and biological network analysis. The state-of-the-art algorithm for enumerating maximal motif-cliques may have to explore all possible subgraphs of a maximal motif-clique and, at each recursive step, check whether a maximal motif-clique has already been enumerated, which is very time-consuming. In this paper, we develop efficient algorithms for maximal motif-clique enumeration over large HINs. We first introduce an order-based framework that avoids duplicate enumeration and has lower time complexity than the existing algorithm. We then propose a pivot-based pruning strategy, which significantly reduces the search space. We further optimize the process of identifying the candidate sets and locating the subgraphs that contain the maximal motif-cliques. Extensive experiments on five real-world HINs demonstrate that our proposed algorithm is highly efficient, running up to three orders of magnitude faster than the state-of-the-art algorithm.
Publisher
Association for Computing Machinery (ACM)