Affiliation:
1. Istituto Dalle Molle di Studi Sull’Intelligenza Artificiale (IDSIA), 6900 Lugano, Switzerland
Abstract
Bayesian networks (BNs) are a foundational model in machine learning and causal inference. Their graphical structure makes it possible to handle high-dimensional problems by dividing them into a sparse collection of smaller ones; it also underlies Judea Pearl’s causality and determines their explainability and interpretability. Despite their popularity, there are almost no resources in the literature on how to compute Shannon’s entropy and the Kullback–Leibler (KL) divergence for BNs under their most common distributional assumptions. In this paper, we provide computationally efficient algorithms for both by leveraging BNs’ graphical structure, and we illustrate them with a complete set of numerical examples. In the process, we show it is possible to reduce the computational complexity of KL from cubic to quadratic for Gaussian BNs.
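As a rough illustration of how graphical structure can simplify such computations, the sketch below uses the standard chain-rule decomposition of entropy for a linear Gaussian BN: the joint entropy is the sum of the nodes' conditional entropies, each of which depends only on that node's residual variance. This is a minimal sketch under stated assumptions, not the paper's algorithm; the function names, the coefficient matrix `B`, and the three-node chain are all hypothetical and chosen only for the example.

```python
import numpy as np

def gbn_entropy(residual_vars):
    """Entropy (in nats) of a linear Gaussian BN as a sum of local terms.

    Each node contributes 0.5 * log(2 * pi * e * v), where v is the residual
    variance of the node given its parents.
    """
    v = np.asarray(residual_vars, dtype=float)
    return float(np.sum(0.5 * np.log(2 * np.pi * np.e * v)))

def joint_covariance(B, residual_vars):
    """Covariance implied by X = B X + eps, with B[i, j] the coefficient of
    parent j in the regression of node i (assumed notation)."""
    n = B.shape[0]
    A = np.linalg.inv(np.eye(n) - B)
    return A @ np.diag(residual_vars) @ A.T

def mvn_entropy(cov):
    """Entropy (in nats) of a multivariate normal from its full covariance."""
    n = cov.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    return float(0.5 * (n * np.log(2 * np.pi * np.e) + logdet))

# Hypothetical chain X1 -> X2 -> X3 with made-up coefficients and variances.
B = np.array([[0.0, 0.0, 0.0],
              [0.8, 0.0, 0.0],
              [0.0, -0.5, 0.0]])
v = [1.0, 0.5, 2.0]

h_local = gbn_entropy(v)                       # uses only the local variances
h_joint = mvn_entropy(joint_covariance(B, v))  # uses the full covariance matrix
print(h_local, h_joint)
```

The two values agree because, with the nodes in topological order, `B` is strictly lower triangular, so the determinant of the joint covariance equals the product of the residual variances. The local computation never builds or factorises the full covariance matrix.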