The Estimation of Distributions and the Minimum Relative Entropy Principle

Published: 2005-03 Issue: 1 Volume: 13 Page: 1-27
ISSN: 1063-6560
Container-title: Evolutionary Computation
Language: en
Short-container-title: Evolutionary Computation

Author:

Mühlenbein Heinz¹, Höns Robin¹

Affiliation:

1. Fraunhofer Institute for Autonomous Intelligent Systems, 53754 Sankt Augustin, Germany

Abstract

Estimation of Distribution Algorithms (EDA) have been proposed as an extension of genetic algorithms. In this paper we explain the relationship of EDA to algorithms developed in statistics, artificial intelligence, and statistical physics. The major design issues are discussed within a general interdisciplinary framework. It is shown that maximum entropy approximations play a crucial role. All proposed algorithms try to minimize the Kullback-Leibler divergence KLD between the unknown distribution p(x) and a class q(x) of approximations. However, the Kullback-Leibler divergence is not symmetric. Approximations which suppose that the function to be optimized is additively decomposed (ADF) minimize KLD(q||p); the methods which learn the approximate model from data minimize KLD(p||q). This minimization is identical to maximizing the log-likelihood. In the paper three classes of algorithms are discussed. FDA uses the ADF to compute an approximate factorization of the unknown distribution. The factors are marginal distributions, whose values are computed from samples. The second class is represented by the Bethe-Kikuchi approach which has recently been rediscovered in statistical physics. Here the values of the marginals are computed from a difficult constrained minimization problem. The third class learns the factorization from the data. We analyze our learning algorithm LFDA in detail. It is shown that learning is faced with two problems: first, to detect the important dependencies between the variables, and second, to create an acyclic Bayesian network of bounded clique size.
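
The two claims in the abstract about the Kullback-Leibler divergence (that it is not symmetric, and that minimizing KLD(p||q) over a model q is the same as maximizing the log-likelihood) can be checked numerically on a toy example. The following is a minimal sketch, not code from the paper: the two-variable distribution, the sample counts, and all function names are assumptions chosen for illustration. It uses the identity KLD(p̂||q) = -H(p̂) - (1/N) Σ log q(xᵢ), where p̂ is the empirical distribution of N samples, so the divergence and the average log-likelihood differ only by a term that does not depend on q.

```python
import math
from collections import Counter


def kld(p, q):
    """KLD(p||q) = sum_x p(x) * log(p(x) / q(x)); terms with p(x) = 0 contribute 0."""
    return sum(px * math.log(px / q[x]) for x, px in p.items() if px > 0)


def entropy(p):
    """Shannon entropy H(p) in nats."""
    return -sum(px * math.log(px) for px in p.values() if px > 0)


def product_of_marginals(p):
    """Fully factorized approximation q(x) = prod_i p_i(x_i) over binary variables."""
    n = len(next(iter(p)))
    marginals = [{0: 0.0, 1: 0.0} for _ in range(n)]
    for x, px in p.items():
        for i, xi in enumerate(x):
            marginals[i][xi] += px
    return {x: math.prod(marginals[i][xi] for i, xi in enumerate(x)) for x in p}


def average_log_likelihood(samples, q):
    """(1/N) * sum_i log q(x_i)."""
    return sum(math.log(q[x]) for x in samples) / len(samples)


# Hypothetical data set: two correlated binary variables (illustrative counts).
samples = [(0, 0)] * 4 + [(0, 1)] * 1 + [(1, 0)] * 1 + [(1, 1)] * 4
counts = Counter(samples)
p_hat = {x: c / len(samples) for x, c in counts.items()}  # empirical distribution

# Approximation that ignores the dependency between the two variables.
q = product_of_marginals(p_hat)

forward = kld(p_hat, q)   # direction minimized when a model is learned from data
reverse = kld(q, p_hat)   # the other direction; generally a different number

# Identity linking the forward divergence to the log-likelihood of the samples.
identity_rhs = -entropy(p_hat) - average_log_likelihood(samples, q)

print(f"KLD(p||q) = {forward:.6f}")
print(f"KLD(q||p) = {reverse:.6f}")
print(f"-H(p) - avg log-likelihood = {identity_rhs:.6f}")
```

Because the entropy of the empirical distribution is fixed by the data, any q that raises the average log-likelihood lowers KLD(p̂||q) by the same amount. The reverse direction KLD(q||p) has no such link to the likelihood.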

Publisher

MIT Press - Journals

Subject

Computational Mathematics

Link

https://www.mitpressjournals.org/doi/pdf/10.1162/1063656053583469

Reference: 15 articles.

1. Statistical theory of superlattices

2. $I$-Divergence Geometry of Probability Distributions and Minimization Problems

3. Inference in belief networks: A procedural guide

4. Information Theory and Statistical Mechanics

5. A Theory of Cooperative Phenomena

Cited by 59 articles.

1. Factorized models in neural architecture search: Impact on computational costs and performance;2024 International Joint Conference on Neural Networks (IJCNN);2024-06-30

2. Maneuvering Extended Object Tracking with Modified Star-Convex Random Hypersurface Model Based on Minimum Cosine Distance;Remote Sensing;2022-09-03

3. Concept Driven Search and Visualization System for Exploring Scientific Repositories;8th ACM IKDD CODS and 26th COMAD;2020-12-27

4. Symmetric-Approximation Energy-Based Estimation of Distribution (SEED): A Continuous Optimization Algorithm;IEEE Access;2019

5. Dual mean field search for large scale linear and quadratic knapsack problems;Physica A: Statistical Mechanics and its Applications;2017-07