Affiliation:
1. New Jersey Institute of Technology, USA
2. Drexel University, USA
3. University of Missouri, USA
Abstract
As an unsupervised learning process, document clustering has been used to improve information retrieval performance by grouping similar documents, and to support text mining approaches by providing them with high-quality input. In this article, the authors propose a novel hybrid clustering technique that incorporates semantic smoothing of document models into a neural network framework. The semantic smoothing model has recently been reported to enhance retrieval quality in Information Retrieval (IR). Inspired by that result, the authors developed a context-sensitive semantic smoothing model and applied it to improve the accuracy of the clustering generated by a dynamic growing cell structure algorithm, a variant of the neural network technique. They evaluated the proposed technique on biomedical article sets from MEDLINE, the largest biomedical digital library in the world. Their experimental evaluations show that the proposed algorithm significantly improves clustering quality over traditional clustering techniques, including k-means and the self-organizing map (SOM).
Subject
Hardware and Architecture, Software
Cited by
6 articles.