Affiliation:
1. Faculty of Computer Science, China University of Geosciences, Wuhan, Hubei 430074, P. R. China
2. Faculty of Mathematics, China University of Geosciences, Wuhan, Hubei 430074, P. R. China
Abstract
Traditionally, the performance of a classifier is measured by its classification accuracy or error rate. Probability-based classifiers, however, also produce class probability estimates (the probability that a test instance belongs to the predicted class). This information is often ignored in classification, where all that matters is whether the class with the highest estimated probability matches the actual class. In many data mining applications, however, classification accuracy and error rate are not enough. For example, in direct marketing, we often need to deploy different promotion strategies to customers with different likelihoods (class probabilities) of buying a product. Accurate class probability estimates are therefore often required to make optimal decisions. In this paper, we first review some state-of-the-art probability-based classifiers and empirically investigate their class probability estimation performance. Our experimental results show that C4.4 is an attractive algorithm for class probability estimation. We then present a locally weighted version of C4.4 that improves its class probability estimation performance by combining locally weighted learning with C4.4. We call the improved algorithm locally weighted C4.4, or LWC4.4 for short. We test LWC4.4 experimentally on all 36 UCI data sets selected by Weka. The results show that LWC4.4 significantly outperforms C4.4 in terms of class probability estimation.
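To make the idea of a class probability estimate concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a C4.4-style leaf uses the Laplace correction, as C4.4 is commonly described, and that locally weighted learning assigns each training instance a weight that shrinks with its distance from the test instance. The linear kernel and the names `neighbor_weights` and `laplace_estimate` are illustrative assumptions. Tree induction itself is omitted; only the leaf-level estimate is shown.

```python
def neighbor_weights(distances, k):
    """Hypothetical linear kernel: weight = max(0, 1 - d / d_k), where d_k is
    the distance to the k-th nearest training instance (an assumption)."""
    bandwidth = sorted(distances)[k - 1]
    if bandwidth == 0:
        # All k nearest neighbors coincide with the test instance.
        return [1.0 if d == 0 else 0.0 for d in distances]
    return [max(0.0, 1.0 - d / bandwidth) for d in distances]


def laplace_estimate(leaf_labels, classes, weights=None):
    """Laplace-corrected class probabilities for the instances at one leaf.

    p(c) = (weighted count of c + 1) / (total weight + number of classes)
    """
    if weights is None:
        weights = [1.0] * len(leaf_labels)
    totals = {c: 0.0 for c in classes}
    for label, w in zip(leaf_labels, weights):
        totals[label] += w
    denom = sum(weights) + len(classes)
    return {c: (totals[c] + 1.0) / denom for c in classes}


classes = ["buy", "no"]

# Unweighted leaf: a pure leaf no longer yields an estimate of exactly 1.0.
pure_leaf = laplace_estimate(["buy", "buy"], classes)

# Locally weighted leaf: nearer training instances count for more.
weights = neighbor_weights([0.0, 1.0, 2.0, 4.0], k=3)
local_leaf = laplace_estimate(["buy", "buy", "no", "no"], classes, weights)
```

In this sketch the smoothing keeps small or pure leaves from producing extreme estimates of 0 or 1, and the weights let instances close to the test instance dominate the estimate.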
Publisher
World Scientific Pub Co Pte Lt
Subject
Artificial Intelligence, Computer Vision and Pattern Recognition, Software
Cited by
11 articles.