Abstract
AbstractClustering is an important task in knowledge discovery with the goal to identify structures of similar data points in a dataset. Here, the focus lies on methods that use a human-in-the-loop, i.e., incorporate user decisions into the clustering process through 2D and 3D displays of the structures in the data. Some of these interactive approaches fall into the category of visual analytics and emphasize the power of such displays to identify the structures interactively in various types of datasets or to verify the results of clustering algorithms. This work presents a new method called interactive projection-based clustering (IPBC). IPBC is an open-source and parameter-free method using a human-in-the-loop for an interactive 2.5D display and identification of structures in data based on the user’s choice of a dimensionality reduction method. The IPBC approach is systematically compared with accessible visual analytics methods for the display and identification of cluster structures using twelve clustering benchmark datasets and one additional natural dataset. Qualitative comparison of 2D, 2.5D and 3D displays of structures and empirical evaluation of the identified cluster structures show that IPBC outperforms comparable methods. Additionally, IPBC assists in identifying structures previously unknown to domain experts in an application.
Funder
Philipps-Universität Marburg
Publisher
Springer Science and Business Media LLC
Subject
Applied Mathematics,Computational Theory and Mathematics,Computer Science Applications,Modeling and Simulation,Information Systems
Reference80 articles.
1. Cook, K.A., Thomas, J.J.: Illuminating the Path: The Research and Development Agenda for Visual Analytics. PNNL, Richland (2005)
2. Keim, D.A., Mansmann, F., Thomas, J.: Visual analytics: how much visualization and how much analytics? ACM SIGKDD Explorations Newslett. 11, 5–8 (2010)
3. Chen, K., Liu, L.: VISTA: validating and refining clusters via visualization. Inf. Vis. 3, 257–270 (2004)
4. Venna, J., Peltonen, J., Nybo, K., Aidos, H., Kaski, S.: Information retrieval perspective to nonlinear dimensionality reduction for data visualization. J. Mach. Learn. Res. 11, 451–490 (2010)
5. Mirkin, B.G.: Clustering: A Data Recovery Approach. CRC Press, Boca Raton, FL (2005)
Cited by
7 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献