Short Text Clustering Algorithm with Feature Keyword Expansion

Published: 2012-06 Issue: Volume: 532-533 Page: 1716-1720
ISSN:1662-8985
Container-title:Advanced Materials Research
Short-container-title:AMR

Author:

Jin Chun Xia¹, Zhou Hai Yan¹, Bai Qiu Chan¹

Affiliation:

1. Huaiyin Institute of Technology

Abstract

To address keyword sparsity and similarity drift in short text segments, this paper proposes a short text clustering algorithm with feature keyword expansion (STCAFKE). The method expands feature keywords using HowNet and clusters the expanded texts by combining the K-means algorithm with a density-based algorithm. Feature keyword expansion increases the number of keywords per text and enriches its semantic features, which makes short text clustering feasible. Experimental results show that the algorithm improves short text clustering quality in terms of precision and recall.
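
The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' STCAFKE implementation: the `RELATED` dictionary is a hypothetical stand-in for a HowNet lookup, the density step is assumed to be used only for choosing initial K-means seeds, and the `radius` threshold and cosine similarity are illustrative choices not stated in the abstract.

```python
from collections import Counter
import math

# Hypothetical stand-in for a HowNet lookup: keyword -> related terms.
RELATED = {
    "phone": ["mobile", "handset"],
    "mobile": ["phone", "handset"],
    "soccer": ["football", "match"],
    "football": ["soccer", "match"],
}

def expand(tokens, related=RELATED):
    """Return a bag of words for the tokens plus their related terms."""
    expanded = list(tokens)
    for tok in tokens:
        expanded.extend(related.get(tok, []))
    return Counter(expanded)

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def density_seeds(vectors, k, radius=0.3):
    """Pick up to k seeds: dense points that are dissimilar to earlier seeds."""
    density = [sum(1 for w in vectors if cosine(v, w) >= radius) for v in vectors]
    order = sorted(range(len(vectors)), key=lambda i: -density[i])
    seeds = [order[0]]
    for i in order[1:]:
        if len(seeds) == k:
            break
        if all(cosine(vectors[i], vectors[s]) < radius for s in seeds):
            seeds.append(i)
    return [vectors[s] for s in seeds]

def kmeans(vectors, k, iters=10):
    """K-means with cosine similarity, seeded by density_seeds."""
    centroids = density_seeds(vectors, k)
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assign each text to its most similar centroid.
        labels = [
            max(range(len(centroids)), key=lambda c: cosine(v, centroids[c]))
            for v in vectors
        ]
        # Recompute each centroid as the summed bag of its members
        # (cosine similarity is scale-invariant, so a sum acts like a mean).
        for c in range(len(centroids)):
            total = Counter()
            for v, label in zip(vectors, labels):
                if label == c:
                    total.update(v)
            if total:
                centroids[c] = total
    return labels

texts = [
    ["phone", "battery"],
    ["mobile", "screen"],
    ["soccer", "goal"],
    ["football", "league"],
]
vectors = [expand(t) for t in texts]
labels = kmeans(vectors, 2)
print(labels)  # [0, 0, 1, 1]
```

Before expansion, the first two toy texts share no keyword, so their cosine similarity is zero; after expansion they share "phone", "mobile", and "handset" and fall into the same cluster.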

Publisher

Trans Tech Publications, Ltd.

Subject

General Engineering

Link

https://www.scientific.net/AMR.532-533.1716.pdf

References: 7 articles.

1. Carullo M, Binaghi E, Gallo I. An online document clustering technique for short web contents. Pattern Recognit Lett, 2009, 30(10), pp.870–876.

2. Pinto D, Benedí JM, Rosso P. Clustering narrow-domain short texts by using the Kullback-Leibler distance. In: Gelbukh A. (ed.) CICLing 2007, LNCS, vol. 4394, pp.611–622.

3. Liu Qun, Li SuJian. Word similarity computing based on HowNet. Computational Linguistics and Chinese Language Processing, 2002, 7(2), pp.59–76.

4. Lin Li. Text clustering research based on semantic distance. Xiamen University Master thesis, 2007(4).

5. Wan Xiaojun. A novel document similarity measure based on earth mover's distance. Information Sciences, 2007, pp.3718–3730.

Cited by 5 articles.

1. Research on user generated content in Q&A system and online comments based on text mining;Alexandria Engineering Journal;2022-10

2. A patent keywords extraction method using TextRank model with prior public knowledge;Complex & Intelligent Systems;2021-03-29

3. State-of-art;Proceedings of the 4th International Conference on Communication and Information Processing - ICCIP '18;2018

4. Text mining and semantics: a systematic mapping study;Journal of the Brazilian Computer Society;2017-06-29

5. An Improved Feature Selection Method for Chinese Short Texts Clustering Based on HowNet;Lecture Notes in Electrical Engineering;2013-12-05