Affiliation:
1. Chung Yuan Christian University
Abstract
Patents are distributed through hundreds of collections, divided up by general area. A hybrid classifier system can therefore be a powerful solution to difficult patent classification problems. In this study, we present a system that classifies patent documents using a hybrid approach that combines multiple text classifiers (Naïve Bayes, KNN, and Rocchio). The system combines the decisions of the individual text classifiers through voting and sampling mechanisms. A prototype system was developed and tested on a real-world task. The results indicate that the accuracy of the hybrid approach is more stable than that of any of the three individual text classifiers.
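As a rough illustration of the voting idea mentioned in the abstract, the sketch below combines one predicted label per classifier by plain majority vote. The abstract does not specify the voting scheme, so the tie-breaking rule (the first classifier in the list wins), the function name `majority_vote`, and the example class labels are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the most classifiers.

    `predictions` holds one label per classifier, in a fixed classifier
    order. Ties go to the label that appears first in that order
    (an assumed rule; the source does not describe tie handling).
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


# Hypothetical outputs of the three classifiers for one patent document.
votes = {"naive_bayes": "G06F", "knn": "H04L", "rocchio": "G06F"}
combined = majority_vote(list(votes.values()))
print(combined)
```

With three classifiers, a label needs only two votes to win, so a single classifier's error on a document can be outvoted by the other two, which is one plausible reason a combined decision would vary less than any individual one.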
Publisher
Trans Tech Publications, Ltd.
References: 12 articles.
1. F. A. Barros, E. F. A. Silva, R. B. C. Prudencio, V. M. Filho, A. C. A. Nascimento: Combining Text Classifiers and Hidden Markov Models for Information Extraction. International Journal on Artificial Intelligence Tools. Vol. 18(2) (2009).
2. P. Bennett, S. Dumais, E. Horvitz: The Combination of Text Classifiers Using Reliability Indicators. Information Retrieval. Vol. 8 (2005), pp.67-100.
3. C. H. Lee: A Study of Applying Data Mining Classification Techniques to Patent Analysis. Master Thesis, Chung-Yuan Univ. (2003).
4. C. J. Fall, A. Torcsvari, K. Benzineb, G. Karetka: Automated Categorization in the International Patent Classification. ACM SIGIR Forum. Vol. 37(1) (2003), pp.10-25.
5. A. K. Khalid, A. Tyrrel, A. Vachher, T. Travers: Combining Multiple Classifiers for Text Categorization. CIKM 2001. (2001), pp.5-10.