Author:
Takama Yasufumi, Tanaka Yuna, Mori Yoshiyuki, Shibata Hiroki
Abstract
This paper proposes a Treemap-based visualization for supporting cluster analysis of multi-dimensional data. Grasping the data distribution in a target dataset is important for tasks such as machine learning and cluster analysis. When dealing with multi-dimensional data such as statistical data and document datasets, dimensionality reduction algorithms are usually applied to project the original data onto a lower-dimensional space. However, dimensionality reduction tends to lose the characteristics that the data have in the original space. In particular, the border between different data groups may not be represented correctly in the lower-dimensional space. To overcome this problem, the proposed visualization method applies Fuzzy c-Means to the target data and uses a Treemap to visualize the result on the basis of each object's highest and second-highest membership values. Visualizing information about not only the closest cluster but also the second-closest one is expected to be useful for identifying objects near the border between different clusters, as well as for understanding the relationships between clusters. A prototype interface is implemented, and its effectiveness is investigated through a user experiment on a news article dataset. A case study applying the method to a word embedding space, as another kind of text data, is also presented.
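The grouping idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not describe the Treemap layout or the interface, so the nested grouping below (closest cluster, then second-closest cluster) is an assumption about how a Treemap hierarchy might be formed. The function names `fcm_memberships`, `top_two`, and `treemap_hierarchy`, the toy points, and the hard-coded centers are all hypothetical. Only the membership step of Fuzzy c-Means is shown, using the standard formula for fixed centers; the full algorithm alternates this step with center updates.

```python
from collections import defaultdict
from math import dist


def fcm_memberships(point, centers, m=2.0):
    """Standard FCM membership of one point in each cluster, for fixed centers."""
    d = [dist(point, c) for c in centers]
    # A point lying exactly on a center belongs fully to that cluster.
    if 0.0 in d:
        hits = [1.0 if x == 0.0 else 0.0 for x in d]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)  # m > 1 is the fuzzifier
    return [1.0 / sum((d_i / d_j) ** power for d_j in d) for d_i in d]


def top_two(memberships):
    """Indices of the highest and second-highest membership values."""
    order = sorted(range(len(memberships)),
                   key=lambda k: memberships[k], reverse=True)
    return order[0], order[1]


def treemap_hierarchy(points, centers, m=2.0):
    """Group point indices as {closest cluster: {second-closest: [indices]}}."""
    tree = defaultdict(lambda: defaultdict(list))
    for idx, p in enumerate(points):
        first, second = top_two(fcm_memberships(p, centers, m))
        tree[first][second].append(idx)
    return {k: dict(v) for k, v in tree.items()}


# Toy 2-D data (assumption); centers are hard-coded rather than fitted.
centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [(1.0, 0.5), (4.5, 0.0), (9.0, 1.0), (0.5, 6.0)]
hierarchy = treemap_hierarchy(points, centers)
print(hierarchy)
```

In this sketch, each outer key would correspond to a top-level Treemap region and each inner key to a sub-region, so objects whose second-closest cluster differs from that of their neighbors are separated visually. A point such as `(4.5, 0.0)`, which lies almost midway between two centers, has nearly equal first and second membership values and is the kind of border object the abstract refers to.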
Funder
Japan Society for the Promotion of Science
Publisher
Fuji Technology Press Ltd.
Subject
Artificial Intelligence, Computer Vision and Pattern Recognition, Human-Computer Interaction
Cited by
1 article.
1. Data mining of Chain convenience stores location; Applied Mathematics and Nonlinear Sciences; 2022-05-31