Abstract
Data clustering, local pattern mining, and community detection in graphs are three mature areas of data mining and machine learning. In recent years, attributed subgraph mining has emerged as a powerful new data mining task at the intersection of these areas. Given a graph and a set of attributes for each vertex, attributed subgraph mining aims to find cohesive subgraphs for which (some of) the attributes have exceptional values. The principled integration of graph and attribute data poses two challenges: (1) the definition of a pattern syntax (the abstract form of patterns) that is intuitive and lends itself to efficient search, and (2) the formalization of the interestingness of such patterns. We propose an integrated solution to both of these challenges. The proposed pattern syntax improves upon prior work in being both highly flexible and intuitive. In addition, we define an effective and principled algorithm to enumerate patterns of this syntax. The proposed approach for quantifying the interestingness of these patterns is rooted in information theory and is able to account for background knowledge on the data. While prior work quantified the interestingness of the cohesion of the subgraph and of the exceptionality of its attributes separately, and then combined these in a parameterized trade-off, we instead handle this trade-off implicitly in a principled, parameter-free manner. Empirical results confirm that we can efficiently find highly interesting subgraphs.
Publisher
Springer Science and Business Media LLC
Subject
Computer Networks and Communications, Computer Science Applications, Information Systems
Cited by
6 articles.
1. Mining Java Memory Errors using Subjective Interesting Subgroups with Hierarchical Targets;2023 IEEE International Conference on Data Mining Workshops (ICDMW);2023-12-04
2. Unexpected Attributed Subgraphs: a Mining Algorithm;Proceedings of the International Conference on Advances in Social Networks Analysis and Mining;2023-11-06
3. Polynomial-delay enumeration algorithms in set systems;Theoretical Computer Science;2023-06
4. Enumeration of Support-Closed Subsets in Confluent Systems;Algorithmica;2022-01-24
5. GraphAnoGAN: Detecting Anomalous Snapshots from Attributed Graphs;Machine Learning and Knowledge Discovery in Databases. Research Track;2021