Abstract
In data science, an unknown information source is estimated by a predictive distribution defined from a statistical model and a prior. In the older Bayesian framework, the Bayesian predictive distribution was held to be optimal under the assumptions that the statistical model is believed to be correct and the prior expresses a subjective belief in a small world. However, such a restricted treatment of Bayesian inference cannot be applied to highly complicated statistical models and learning machines in a large world. In 1980, Akaike proposed a new scientific paradigm of Bayesian inference in which both the model and the prior are candidate systems, to be designed by mathematical procedures so that the predictive distribution better approximates the unknown information source. Nowadays, Akaike's proposal is widely accepted in statistics, data science, and machine learning. In this paper, in order to establish a mathematical foundation for evaluating a statistical model and a prior, we show the relations among the generalization loss, the information criteria, and the cross-validation loss, and then compare them from three different points of view. First, their performances are compared in singular problems, where the posterior distribution is far from any normal distribution. Second, they are studied in the case where the data contain a leverage sample point. Last, their stochastic properties are clarified when they are used for prior optimization. The mathematical and experimental comparisons show the equivalences and differences among these measures, which we expect to be useful in practical applications.
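To make the quantities compared in the abstract concrete, the following is a minimal sketch (not taken from the paper itself) of computing two of them from posterior samples in a toy conjugate model: the WAIC, i.e. training loss plus functional variance, and the importance-sampling leave-one-out cross-validation loss. The model, prior, and sample sizes here are illustrative assumptions; for a regular model such as this one the two estimates nearly coincide, which is the kind of equivalence the paper studies.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data from N(0.5, 1); model x ~ N(mu, 1) with prior mu ~ N(0, 10^2).
n = 50
x = rng.normal(0.5, 1.0, size=n)

# Conjugate posterior of mu (known unit noise variance, prior variance 100),
# sampled directly instead of by MCMC to keep the sketch self-contained.
post_var = 1.0 / (n + 1.0 / 100.0)
post_mean = post_var * x.sum()
mu = rng.normal(post_mean, np.sqrt(post_var), size=4000)

# Per-point log-likelihoods log p(x_i | mu_s); shape (draws, n).
logp = -0.5 * np.log(2 * np.pi) - 0.5 * (x[None, :] - mu[:, None]) ** 2

def logmeanexp(a, axis=0):
    """Numerically stable log of the mean of exp(a) along an axis."""
    m = a.max(axis=axis)
    return m + np.log(np.exp(a - m).mean(axis=axis))

# Training loss T_n = -(1/n) sum_i log E_w[p(x_i|w)].
train_loss = -logmeanexp(logp, axis=0).mean()
# Functional variance term V_n/n = (1/n) sum_i Var_w[log p(x_i|w)].
func_var = logp.var(axis=0).mean()
# WAIC = T_n + V_n/n.
waic = train_loss + func_var
# Importance-sampling LOO-CV loss = (1/n) sum_i log E_w[1/p(x_i|w)].
cv = logmeanexp(-logp, axis=0).mean()

print(f"WAIC = {waic:.4f}, IS-CV = {cv:.4f}")
```

In singular models, or when a leverage point is present, these two estimates can behave differently, which is the situation the paper analyzes.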
Publisher
Springer Science and Business Media LLC
Subject
Computational Theory and Mathematics, Statistics and Probability
References: 39 articles.
1. Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19(6), 716–723.
2. Akaike, H. (1980). Likelihood and Bayes procedure. Bayesian Statistics, 143–166.
3. Akaike, H. (1980). On the transition of the paradigm of statistical inference. Proceedings of the Institute of Statistical Mathematics, 27, 5–12.
4. Amaral Turkman, M. A., Paulino, C. D., & Müller, P. (2019). Computational Bayesian statistics. Cambridge University Press.
5. Aoyagi, M., & Watanabe, S. (2005). Stochastic complexities of reduced rank regression in Bayesian estimation. Neural Networks, 18, 924–933.
Cited by: 9 articles.