Abstract
This paper investigates the problem of simultaneously predicting multiple binary responses by utilizing a shared set of covariates. Our approach incorporates machine learning techniques for binary classification, without making assumptions about the underlying observations. Instead, our focus lies on a group of predictors, aiming to identify the one that minimizes prediction error. Unlike previous studies that primarily address estimation error, we directly analyze the prediction error of our method using PAC-Bayesian bounds techniques. In this paper, we introduce a pseudo-Bayesian approach capable of handling incomplete response data. Our strategy is efficiently implemented using the Langevin Monte Carlo method. Through simulation studies and a practical application using real data, we demonstrate the effectiveness of our proposed method, producing comparable or sometimes superior results compared to the current state-of-the-art method.
Publisher
Springer Science and Business Media LLC
Subject
Computational Theory and Mathematics; Statistics, Probability and Uncertainty; Statistics and Probability; Theoretical Computer Science