Affiliation:
1. Logistical Engineering University
Abstract
A method for short-text classification using the HowNet ontology is proposed. First, high-frequency domain words are extracted as feature words. The feature words are then expanded into concepts using HowNet, which enriches the features semantically and mitigates feature sparsity. Finally, semantic correlation values between words are obtained by calculating the distance between their concepts in the node tree. Experimental results show that both classification efficiency and precision are improved.
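The steps in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The concept tree `PARENT` is a hypothetical toy hierarchy rather than HowNet data. The helper names `top_features`, `tree_distance`, and `similarity` are assumptions made for illustration. The distance-to-similarity mapping `alpha / (d + alpha)`, with its default `alpha`, is likewise an assumed form, since the abstract does not state the formula used.

```python
from collections import Counter

# Hypothetical toy concept tree (child -> parent); NOT real HowNet data.
PARENT = {
    "entity": None,
    "animal": "entity",
    "artifact": "entity",
    "dog": "animal",
    "cat": "animal",
    "car": "artifact",
}


def top_features(docs, k):
    """Return the k most frequent words across tokenized documents."""
    counts = Counter(word for doc in docs for word in doc)
    return [word for word, _ in counts.most_common(k)]


def path_to_root(node):
    """List the nodes from `node` up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def tree_distance(a, b):
    """Number of edges on the path between concepts a and b."""
    steps_from_a = {n: i for i, n in enumerate(path_to_root(a))}
    for steps_from_b, n in enumerate(path_to_root(b)):
        if n in steps_from_a:  # first shared ancestor is the lowest one
            return steps_from_a[n] + steps_from_b
    raise ValueError("concepts share no common ancestor")


def similarity(a, b, alpha=1.6):
    """Map tree distance to (0, 1]; this formula and alpha are assumptions."""
    return alpha / (tree_distance(a, b) + alpha)
```

In this toy tree, `dog` and `cat` share the parent `animal`, so they are closer (and score higher) than `dog` and `car`, whose only shared ancestor is the root.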
Publisher
Trans Tech Publications, Ltd.
Cited by
3 articles.