Revisiting the Optimal Probability Estimator from Small Samples for Data Mining

Author:

Bojan Cestnik (1, 2)

Affiliation:

1. Department of Knowledge Technologies, Jožef Stefan Institute, Jamova 39, 1000 Ljubljana, Slovenia

2. Temida d.o.o., Dunajska cesta 51, 1000 Ljubljana, Slovenia

Abstract

Estimation of probabilities from empirical data samples has drawn close attention in the scientific community and has been identified as a crucial phase in many machine learning and knowledge discovery research projects and applications. In addition to the trivial and straightforward estimation with relative frequency, more elaborate probability estimation methods for small samples have been proposed and applied in practice (e.g., Laplace's rule, the m-estimate). Piegat and Landowski (2012) proposed a novel probability estimation method for small samples, Eph√2, that is optimal with respect to the mean absolute error of the estimation result. In this paper we show that, even though the articulation of Piegat's formula seems different, it is in fact a special case of the m-estimate with p_a = 1/2 and m = √2. Within an experimental framework, we present an in-depth analysis of several probability estimation methods with respect to their mean absolute errors and demonstrate their potential advantages and disadvantages. We extend the analysis from single-instance samples to samples with a moderate number of instances. For the purpose of probability estimation, we define small samples as samples containing either fewer than four successes or fewer than four failures, and justify the definition by analysing probability estimation errors on various sample sizes.
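The equivalence claimed in the abstract can be checked numerically. A minimal sketch, assuming the standard m-estimate formula p = (s + m·p_a)/(n + m) and Piegat and Landowski's estimator written as (s + √2/2)/(n + √2), where s is the number of successes in n trials (the function names here are illustrative, not from the paper):

```python
import math


def m_estimate(s, n, p_a, m):
    """Generic m-estimate of probability: (s + m * p_a) / (n + m)."""
    return (s + m * p_a) / (n + m)


def eph_sqrt2(s, n):
    """Piegat and Landowski's Eph-sqrt2 estimator, written directly:
    (s + sqrt(2)/2) / (n + sqrt(2))."""
    return (s + math.sqrt(2) / 2) / (n + math.sqrt(2))


# The two estimators coincide when p_a = 1/2 and m = sqrt(2),
# which is the special case identified in the paper.
for s, n in [(0, 1), (1, 1), (2, 3), (5, 10)]:
    assert abs(m_estimate(s, n, 0.5, math.sqrt(2)) - eph_sqrt2(s, n)) < 1e-12
```

Substituting p_a = 1/2 and m = √2 into the m-estimate gives (s + √2·(1/2))/(n + √2) = (s + √2/2)/(n + √2), which is exactly the Eph√2 expression.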

Publisher

Walter de Gruyter GmbH

Subject

Applied Mathematics, Engineering (miscellaneous), Computer Science (miscellaneous)

References (36 articles)

1. Berger, J.O. (1985). Statistical Decision Theory and Bayesian Analysis, Springer, New York, NY.

2. Bouguila, N. (2013). On the smoothing of multinomial estimates using Liouville mixture models and applications, Pattern Analysis and Applications 16(3): 349–363.

3. Breiman, L., Friedman, J.H., Olshen, R.A. and Stone, C.J. (1984). Classification and Regression Trees, Wadsworth, Belmont.

4. Calvo, B. and Santafé, G. (2016). SCMAMP: Statistical comparison of multiple algorithms in multiple problems, The R Journal 8(1): 248–256.

5. Cestnik, B. (1990). Estimating probabilities: A crucial task in machine learning, Proceedings of the 9th European Conference on Artificial Intelligence, London, UK, pp. 147–149.

Cited by 2 articles.

1. On finding the optimal parameters for probability estimation with m-estimate, Proceedings of the 21st International Conference on Computer Systems and Technologies '20, 2020-06-19.

2. A Novel and Simple Mathematical Transform Improves the Performance of Lernmatrix in Pattern Classification, Mathematics, 2020-05-06.
