Author:
Hung Chihli, Wermter Stefan
Abstract
Purpose: The purpose of this paper is to examine neural document clustering techniques, e.g. self‐organising map (SOM) or growing neural gas (GNG), which usually assume that textual information is stationary in quantity.
Design/methodology/approach: The authors propose a novel dynamic adaptive self‐organising hybrid (DASH) model, which adapts not only its neural topological structure but also its main parameters to time‐event news collections in a non‐stationary environment. Based on the features of a time‐event news collection in a non‐stationary environment, they review the main current neural clustering models. The main deficiency of these models is the need to pre‐define the thresholds for unit‐growing and unit‐pruning. The DASH model is therefore designed for a non‐stationary environment.
Findings: The paper compares DASH with SOM and GNG on an artificial jumping corner data set and a real‐world Reuters news collection. According to the experimental results, the DASH model is more effective than SOM and GNG for time‐event document clustering.
Practical implications: A real‐world environment is dynamic. This paper provides an approach to news clustering in a non‐stationary environment.
Originality/value: Text clustering in a non‐stationary environment is a novel concept. The paper demonstrates DASH, which can deal with a real‐world data set in a non‐stationary environment.
Subject
Library and Information Sciences, Computer Science Applications
Cited by
2 articles.