Affiliation:
1. University of Illinois at Urbana-Champaign
2. University of California at Santa Barbara
3. University of Illinois at Chicago and King Abdulaziz University
Abstract
Real-world, multiple-typed objects are often interconnected, forming heterogeneous information networks. A major challenge for link-based clustering in such networks is their potential to generate many different results, carrying rather diverse semantic meanings. In order to generate desired clustering, we propose to use meta-path, a path that connects object types via a sequence of relations, to control clustering with distinct semantics. Nevertheless, it is easier for a user to provide a few examples (seeds) than a weighted combination of sophisticated meta-paths to specify her clustering preference. Thus, we propose to integrate meta-path selection with user-guided clustering to cluster objects in networks, where a user first provides a small set of object seeds for each cluster as guidance. Then the system learns the weight for each meta-path that is consistent with the clustering result implied by the guidance, and generates clusters under the learned weights of meta-paths. A probabilistic approach is proposed to solve the problem, and an effective and efficient iterative algorithm, PathSelClus, is proposed to learn the model, where the clustering quality and the meta-path weights mutually enhance each other. Our experiments with several clustering tasks in two real networks and one synthetic network demonstrate the power of the algorithm in comparison with the baselines.
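To make the notion of a meta-path concrete, the sketch below counts path instances between authors in a toy bibliographic network by chaining relation matrices. It is an illustration only and not the PathSelClus algorithm: the network, the names author_paper, paper_venue, and meta_path_counts, and the fixed weights are all hypothetical, whereas the abstract describes learning the weights from user-provided seeds.

```python
def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a dense matrix."""
    return [list(col) for col in zip(*m)]


def meta_path_counts(relations):
    """Chain relation matrices along a meta-path.

    Entry [i][j] of the result is the number of path instances that
    connect source object i to target object j along the meta-path.
    """
    result = relations[0]
    for rel in relations[1:]:
        result = matmul(result, rel)
    return result


# Hypothetical toy network: 3 authors, 3 papers, 2 venues.
# author_paper[i][j] = 1 if author i wrote paper j.
author_paper = [
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
]
# paper_venue[j][k] = 1 if paper j appeared in venue k.
paper_venue = [
    [1, 0],
    [1, 0],
    [0, 1],
]

# Meta-path Author-Paper-Author: authors linked by shared papers.
apa = meta_path_counts([author_paper, transpose(author_paper)])

# Meta-path Author-Paper-Venue-Paper-Author: authors linked by shared venues.
apvpa = meta_path_counts(
    [author_paper, paper_venue, transpose(paper_venue), transpose(author_paper)]
)

# Assumed, hand-picked weights (NOT learned): a weighted combination of
# meta-paths yields one author-author relation a clustering step could use.
weights = {"APA": 0.5, "APVPA": 0.5}
combined = [
    [weights["APA"] * x + weights["APVPA"] * y for x, y in zip(row_a, row_b)]
    for row_a, row_b in zip(apa, apvpa)
]

print(apa)
print(apvpa)
print(combined)
```

Changing the weights shifts which semantics dominate the combined relation: a larger weight on the shared-paper path favors direct collaboration, while a larger weight on the shared-venue path favors authors who publish in the same venues.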
Funder
U.S. Army Research Laboratory
Division of Information and Intelligent Systems
Air Force Office of Scientific Research
Engineering and Physical Sciences Research Council
University of Illinois at Urbana-Champaign
Publisher
Association for Computing Machinery (ACM)
Cited by
131 articles.