Affiliation:
1. Sony Computer Science Laboratories, Tokyo 141-0022, Japan
Abstract
Exponential families are statistical models that serve as workhorses in statistics, information theory, and machine learning, among other fields. An exponential family can either be normalized subtractively by its cumulant or free energy function, or equivalently normalized divisively by its partition function. Both the cumulant and partition functions are strictly convex and smooth functions inducing corresponding pairs of Bregman and Jensen divergences. It is well known that skewed Bhattacharyya distances between the probability densities of an exponential family amount to skewed Jensen divergences induced by the cumulant function between their corresponding natural parameters, and that in limit cases the sided Kullback–Leibler divergences amount to reverse-sided Bregman divergences. In this work, we first show that the α-divergences between non-normalized densities of an exponential family amount to scaled α-skewed Jensen divergences induced by the partition function. We then show how comparative convexity with respect to a pair of quasi-arithmetic means allows both convex functions and their arguments to be deformed, thereby defining dually flat spaces with corresponding divergences when ordinary convexity is preserved.
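The two classical identities recalled in the abstract can be checked numerically on a concrete family. The sketch below (the Poisson example and all function names are illustrative choices, not taken from the paper) verifies that the Kullback–Leibler divergence equals the reverse-sided Bregman divergence of the cumulant function, and that the skewed Bhattacharyya distance equals the α-skewed Jensen divergence, for the Poisson family with cumulant F(θ) = exp(θ) and natural parameter θ = log(rate):

```python
import math

def F(theta):
    """Cumulant (log-normalizer) of the Poisson family."""
    return math.exp(theta)

def bregman_F(t2, t1):
    """Bregman divergence B_F(t2 : t1); here grad F(t) = exp(t)."""
    return F(t2) - F(t1) - (t2 - t1) * math.exp(t1)

def jensen_F(t1, t2, a):
    """alpha-skewed Jensen divergence induced by F."""
    return a * F(t1) + (1 - a) * F(t2) - F(a * t1 + (1 - a) * t2)

lam1, lam2 = 2.0, 5.0
t1, t2 = math.log(lam1), math.log(lam2)

# Closed-form KL divergence between Poisson(lam1) and Poisson(lam2):
kl = lam1 * math.log(lam1 / lam2) + lam2 - lam1
# KL(p1 : p2) = B_F(theta2 : theta1)  (reverse-sided Bregman divergence)
assert abs(kl - bregman_F(t2, t1)) < 1e-12

# Closed-form alpha-skewed Bhattacharyya distance for Poisson densities:
a = 0.3
bhat = a * lam1 + (1 - a) * lam2 - lam1**a * lam2**(1 - a)
# B_alpha(p1, p2) = J_{F,alpha}(theta1 : theta2)  (skewed Jensen divergence)
assert abs(bhat - jensen_F(t1, t2, a)) < 1e-12
```

Both assertions pass because, for the Poisson family, the closed-form divergences on the rates coincide term by term with the Bregman and Jensen divergences evaluated on the natural parameters.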