Revisiting the Optimal Probability Estimator from Small Samples for Data Mining

Author:

Bojan Cestnik (1, 2)

Affiliation:

1. Department of Knowledge Technologies, Jožef Stefan Institute, Jamova 39, 1000 Ljubljana, Slovenia

2. Temida d.o.o., Dunajska cesta 51, 1000 Ljubljana, Slovenia

Abstract

Estimation of probabilities from empirical data samples has drawn close attention in the scientific community and has been identified as a crucial phase in many machine learning and knowledge discovery research projects and applications. In addition to the trivial and straightforward estimation with relative frequency, more elaborate probability estimation methods for small samples have been proposed and applied in practice (e.g., Laplace's rule, the m-estimate). Piegat and Landowski (2012) proposed a novel probability estimation method for small samples, Eph√2, that is optimal with respect to the mean absolute error of the estimation result. In this paper we show that, even though the articulation of Piegat's formula seems different, it is in fact a special case of the m-estimate with p_a = 1/2 and m = √2. Within an experimental framework, we present an in-depth analysis of several probability estimation methods with respect to their mean absolute errors and demonstrate their potential advantages and disadvantages. We extend the analysis from single-instance samples to samples with a moderate number of instances. For the purpose of probability estimation, we define small samples as samples containing either fewer than four successes or fewer than four failures, and justify the definition by analysing probability estimation errors on various sample sizes.
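The equivalence claimed in the abstract can be checked numerically. A minimal sketch, assuming the standard m-estimate formula p = (s + m·p_a)/(n + m) and Piegat and Landowski's estimator written as (s + √2/2)/(n + √2), where s is the number of successes in n trials (the function names here are illustrative, not from the paper):

```python
import math


def m_estimate(s, n, p_a, m):
    """Generic m-estimate of probability: (s + m * p_a) / (n + m)."""
    return (s + m * p_a) / (n + m)


def eph_sqrt2(s, n):
    """Piegat and Landowski's Eph-sqrt2 estimator, written directly:
    (s + sqrt(2)/2) / (n + sqrt(2))."""
    return (s + math.sqrt(2) / 2) / (n + math.sqrt(2))


# The two estimators coincide when p_a = 1/2 and m = sqrt(2),
# which is the special case identified in the paper.
for s, n in [(0, 1), (1, 1), (2, 3), (5, 10)]:
    assert abs(m_estimate(s, n, 0.5, math.sqrt(2)) - eph_sqrt2(s, n)) < 1e-12
```

Substituting p_a = 1/2 and m = √2 into the m-estimate gives (s + √2·(1/2))/(n + √2) = (s + √2/2)/(n + √2), which is exactly the Eph√2 expression.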

Publisher

Walter de Gruyter GmbH

Subject

Applied Mathematics, Engineering (miscellaneous), Computer Science (miscellaneous)

References (36 articles)

1. Berger, J.O. (1985). Statistical Decision Theory and Bayesian Analysis, Springer, New York, NY.

2. Bouguila, N. (2013). On the smoothing of multinomial estimates using Liouville mixture models and applications, Pattern Analysis and Applications 16(3): 349–363.

3. Breiman, L., Friedman, J.H., Olshen, R.A. and Stone, C.J. (1984). Classification and Regression Trees, Wadsworth, Belmont.

4. Calvo, B. and Santafé, G. (2016). SCMAMP: Statistical comparison of multiple algorithms in multiple problems, The R Journal 8(1): 248–256.

5. Cestnik, B. (1990). Estimating probabilities: A crucial task in machine learning, Proceedings of the 9th European Conference on Artificial Intelligence, London, UK, pp. 147–149.

Cited by 2 articles.

1. On finding the optimal parameters for probability estimation with m-estimate, Proceedings of the 21st International Conference on Computer Systems and Technologies '20, 2020-06-19.

2. A Novel and Simple Mathematical Transform Improves the Performance of Lernmatrix in Pattern Classification, Mathematics, 2020-05-06.
