Neural Attributed Community Search at Billion Scale

Author:

Wang Jianwei1,Wang Kai2,Lin Xuemin2,Zhang Wenjie1,Zhang Ying3

Affiliation:

1. University of New South Wales, Australia

2. Shanghai Jiao Tong University, China

3. University of Technology Sydney, Australia

Abstract

Community search has been extensively studied in the past decades. In recent years, there is a growing interest in attributed community search that aims to identify a community based on both the query nodes and query attributes. A set of techniques have been investigated. Though the recent methods based on advanced learning models such as graph neural networks (GNNs) can achieve state-of-the-art performance in terms of accuracy, we notice that 1) they suffer from severe efficiency issues; 2) they directly model community search as a node classification problem and thus cannot make good use of interdependence among different entities in the graph. Motivated by these, in this paper, we propose a new neur AL attr I buted Community s E arch model for large-scale graphs, termed ALICE. ALICE first extracts a candidate subgraph to reduce the search scope and subsequently predicts the community by the Consistency-aware Net, termed ConNet. Specifically, in the extraction phase, we introduce the density sketch modularity that uses a unified form to combine the strengths of two existing powerful modularities, i.e., classical modularity and density modularity. Based on the new modularity metric, we first adaptively obtain the candidate subgraph, formed by the k-hop neighbors of the query nodes, with the maximum modularity. Then, we construct a node-attribute bipartite graph to take attributes into consideration. After that, ConNet adopts a cross-attention encoder to encode the interaction between the query and the graph. The training of the model is guided by the structure-attribute consistency and the local consistency to achieve better performance. Extensive experiments over 11 real-world datasets including one billion-scale graph demonstrate the superiority of ALICE in terms of accuracy, efficiency, and scalability. Notably, ALICE can improve the F1-score by 10.18% on average and is more efficient on large datasets in comparison to the state-of-the-art. ALICE can finish training on the billion-scale graph within a reasonable time whereas state-of-the-art can not.

Publisher

Association for Computing Machinery (ACM)

Reference55 articles.

1. Martín Arjovsky and Léon Bottou. 2017. Towards Principled Methods for Training Generative Adversarial Networks. In 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24--26, 2017, Conference Track Proceedings. OpenReview.net. https://openreview.net/forum?id=Hk4_qw5xe

2. Martin Arjovsky, Soumith Chintala, and Léon Bottou. 2017. Wasserstein generative adversarial networks. In International conference on machine learning. PMLR, 214--223.

3. arXiv.org submitters. 2023. arXiv Dataset. https://doi.org/10.34740/KAGGLE/DSV/6101996

4. Modularity and community detection in bipartite networks

5. Clustering and Summarizing Protein-Protein Interaction Networks: A Survey

Cited by 2 articles. 订阅此论文施引文献 订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献

1. Querying Historical Cohesive Subgraphs Over Temporal Bipartite Graphs;2024 IEEE 40th International Conference on Data Engineering (ICDE);2024-05-13

2. Efficient Unsupervised Community Search with Pre-Trained Graph Transformer;Proceedings of the VLDB Endowment;2024-05

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3