Affiliation:
1. Beijing Institute of Technology
2. China Youth University for Political Sciences
Abstract
In this paper, we propose a new keyword extraction method based on text classification. We first classify texts to determine their categories. We then determine the weights of candidate words according to both their frequency and the relevance between the words of a text and the text's category. Finally, keywords are extracted by sorting the candidate words by weight. Our experiment shows that, provided the text classification is accurate, this method can effectively extract keywords from texts that have no title or whose title deviates from, and so fails to reflect, the text's subject. The objective selection of a candidate-word weighting function still requires further research.
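The three steps in the abstract (classify, weight, sort) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the `RELEVANCE` table, the stop-word list, the overlap-based `classify` stand-in, and the weighting `frequency * (1 + relevance)`. The abstract itself leaves the choice of weighting function open.

```python
from collections import Counter

# Hypothetical relevance table: category -> {word: relevance in [0, 1]}.
# In practice such values would be estimated from a labeled corpus.
RELEVANCE = {
    "sports": {"match": 0.9, "team": 0.8, "goal": 0.9, "coach": 0.7},
    "finance": {"stock": 0.9, "market": 0.8, "bank": 0.9, "rate": 0.6},
}

# Hypothetical stop-word list used to filter candidate words.
STOPWORDS = {"the", "a", "of", "and", "in", "to", "after", "was", "by"}


def tokenize(text):
    """Lowercase, split on whitespace, keep alphabetic non-stop-words."""
    return [w for w in text.lower().split() if w.isalpha() and w not in STOPWORDS]


def classify(tokens):
    """Stand-in classifier: pick the category with the highest
    frequency-weighted relevance overlap (not the paper's classifier)."""
    counts = Counter(tokens)
    scores = {
        category: sum(counts[word] * rel for word, rel in table.items())
        for category, table in RELEVANCE.items()
    }
    return max(scores, key=scores.get)


def extract_keywords(text, top_k=3):
    """Step 1: classify; step 2: weight candidates; step 3: sort by weight."""
    tokens = tokenize(text)
    category = classify(tokens)
    table = RELEVANCE[category]
    counts = Counter(tokens)
    # Assumed weighting function: frequency scaled by (1 + relevance to category).
    weights = {w: c * (1.0 + table.get(w, 0.0)) for w, c in counts.items()}
    # Sort by descending weight; break ties alphabetically for determinism.
    ranked = sorted(weights, key=lambda w: (-weights[w], w))
    return category, ranked[:top_k]


if __name__ == "__main__":
    sample = "the team scored a goal and the team won the match after the coach changed tactics"
    print(extract_keywords(sample))
```

Because the weights depend on the predicted category, a misclassified text would rank words by relevance to the wrong category, which is why the abstract conditions its result on accurate classification.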
Publisher
Trans Tech Publications, Ltd.
Cited by
1 article.
1. Changes in society and language;Studies in Corpus Linguistics;2020-04-15