Neural Attributed Community Search at Billion Scale

Published:2023-12-08 Issue:4 Volume:1 Page:1-25
ISSN:2836-6573
Container-title:Proceedings of the ACM on Management of Data
language:en
Short-container-title:Proc. ACM Manag. Data

Author:

Wang Jianwei¹,Wang Kai²,Lin Xuemin²,Zhang Wenjie¹,Zhang Ying³

Affiliation:

1. University of New South Wales, Australia

2. Shanghai Jiao Tong University, China

3. University of Technology Sydney, Australia

Abstract

Community search has been extensively studied in the past decades. In recent years, there has been growing interest in attributed community search, which aims to identify a community based on both the query nodes and the query attributes. A range of techniques has been investigated. Though recent methods based on advanced learning models such as graph neural networks (GNNs) can achieve state-of-the-art accuracy, we notice that 1) they suffer from severe efficiency issues; 2) they directly model community search as a node classification problem and thus cannot make good use of the interdependence among different entities in the graph. Motivated by these observations, we propose a new neurAL attrIbuted Community sEarch model for large-scale graphs, termed ALICE. ALICE first extracts a candidate subgraph to reduce the search scope and subsequently predicts the community with the Consistency-aware Net, termed ConNet. Specifically, in the extraction phase, we introduce the density sketch modularity, which uses a unified form to combine the strengths of two existing powerful modularities, i.e., classical modularity and density modularity. Based on the new modularity metric, we first adaptively obtain the candidate subgraph, formed by the k-hop neighbors of the query nodes, with the maximum modularity. Then, we construct a node-attribute bipartite graph to take attributes into consideration. After that, ConNet adopts a cross-attention encoder to encode the interaction between the query and the graph. The training of the model is guided by the structure-attribute consistency and the local consistency to achieve better performance. Extensive experiments over 11 real-world datasets, including one billion-scale graph, demonstrate the superiority of ALICE in terms of accuracy, efficiency, and scalability. Notably, ALICE improves the F1-score by 10.18% on average and is more efficient on large datasets than the state-of-the-art. ALICE can finish training on the billion-scale graph within a reasonable time, whereas the state-of-the-art methods cannot.
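
The extraction step described in the abstract (score the k-hop neighborhoods of the query nodes by a modularity and keep the best one) can be illustrated with a minimal Python sketch. The abstract does not give the formula for the density sketch modularity, so the sketch below scores candidates with the two existing modularities only, using their commonly cited single-community forms as an assumption: classical modularity e_C/m - (d_C/2m)^2 and density modularity (e_C - d_C^2/4m)/|C|, where e_C is the number of internal edges, d_C the degree sum of the community, and m the number of edges in the graph. All function names are hypothetical and this is not the authors' implementation.

```python
from collections import deque


def k_hop_nodes(adj, queries, k):
    """Return the set of nodes within k hops of any query node (multi-source BFS)."""
    seen = set(queries)
    frontier = deque((q, 0) for q in queries)
    while frontier:
        node, dist = frontier.popleft()
        if dist == k:
            continue  # do not expand beyond k hops
        for nb in adj[node]:
            if nb not in seen:
                seen.add(nb)
                frontier.append((nb, dist + 1))
    return seen


def _stats(adj, nodes):
    """Return (m, e_C, d_C): total edges, internal edges, degree sum of `nodes`."""
    m = sum(len(nbrs) for nbrs in adj.values()) // 2
    e_c = sum(1 for u in nodes for v in adj[u] if v in nodes) // 2
    d_c = sum(len(adj[u]) for u in nodes)
    return m, e_c, d_c


def classical_modularity(adj, nodes):
    # Assumed single-community form: e_C / m - (d_C / 2m)^2
    m, e_c, d_c = _stats(adj, nodes)
    return e_c / m - (d_c / (2 * m)) ** 2


def density_modularity(adj, nodes):
    # Assumed form: (e_C - d_C^2 / 4m) / |C|
    m, e_c, d_c = _stats(adj, nodes)
    return (e_c - d_c ** 2 / (4 * m)) / len(nodes)


def best_k_hop_candidate(adj, queries, max_k, score):
    """Try k = 1..max_k and return (score, k, nodes) for the best-scoring k-hop set."""
    best = None
    for k in range(1, max_k + 1):
        nodes = k_hop_nodes(adj, queries, k)
        s = score(adj, nodes)
        if best is None or s > best[0]:
            best = (s, k, nodes)
    return best


# Toy undirected graph: two triangles {0,1,2} and {3,4,5} joined by the edge 2-3.
EXAMPLE_ADJ = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}

if __name__ == "__main__":
    for score in (classical_modularity, density_modularity):
        s, k, nodes = best_k_hop_candidate(EXAMPLE_ADJ, [0], max_k=3, score=score)
        print(score.__name__, k, sorted(nodes), round(s, 4))
```

On the toy graph, both scores prefer the 1-hop neighborhood of node 0, which is exactly the triangle containing it; larger neighborhoods pull in the bridge and the second triangle and score lower.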

Publisher

Association for Computing Machinery (ACM)

Link

https://dl.acm.org/doi/pdf/10.1145/3626738

Cited by 2 articles.

1. Querying Historical Cohesive Subgraphs Over Temporal Bipartite Graphs;2024 IEEE 40th International Conference on Data Engineering (ICDE);2024-05-13

2. Efficient Unsupervised Community Search with Pre-Trained Graph Transformer;Proceedings of the VLDB Endowment;2024-05