Training Products of Experts by Minimizing Contrastive Divergence

Published:2002-08-01 Issue:8 Volume:14 Page:1771-1800
ISSN:0899-7667
Container-title:Neural Computation
language:en

Author:

Hinton Geoffrey E.¹

Affiliation:

1. Gatsby Computational Neuroscience Unit, University College London, London WC1N 3AR, U.K.

Abstract

It is possible to combine multiple latent-variable models of the same data by multiplying their probability distributions together and then renormalizing. This way of combining individual “expert” models makes it hard to generate samples from the combined model but easy to infer the values of the latent variables of each expert, because the combination rule ensures that the latent variables of different experts are conditionally independent when given the data. A product of experts (PoE) is therefore an interesting candidate for a perceptual system in which rapid inference is vital and generation is unnecessary. Training a PoE by maximizing the likelihood of the data is difficult because it is hard even to approximate the derivatives of the renormalization term in the combination rule. Fortunately, a PoE can be trained using a different objective function called “contrastive divergence” whose derivatives with regard to the parameters can be approximated accurately and efficiently. Examples are presented of contrastive divergence learning using several types of expert on several types of data.
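The combination rule described in the abstract (multiply the experts' distributions, then renormalize) can be illustrated with a minimal sketch over a small finite set of outcomes. This is not code from the paper: the function name `product_of_experts` and the two toy experts are assumptions made for illustration. The outcome set here is tiny, so the normalizer can be summed exactly; in the setting the abstract describes, that renormalization term is the part that is hard to handle, which is what motivates contrastive divergence.

```python
from math import prod


def product_of_experts(experts):
    """Multiply expert distributions pointwise, then renormalize.

    `experts` is a list of lists (hypothetical toy input); each inner list
    holds one expert's probabilities over the same finite set of outcomes.
    """
    n_outcomes = len(experts[0])
    if any(len(e) != n_outcomes for e in experts):
        raise ValueError("all experts must cover the same outcomes")

    # Unnormalized product: an outcome scores high only if every expert allows it.
    unnormalized = [prod(e[i] for e in experts) for i in range(n_outcomes)]

    # Renormalization term; tractable here only because the outcome set is tiny.
    z = sum(unnormalized)
    if z == 0.0:
        raise ValueError("experts assign zero joint mass to every outcome")
    return [u / z for u in unnormalized]


# Two toy experts, each ruling out a different half of the outcomes.
expert_a = [0.5, 0.5, 0.0, 0.0]
expert_b = [0.0, 0.5, 0.5, 0.0]
combined = product_of_experts([expert_a, expert_b])
print(combined)
```

In this toy case the product concentrates all of its mass on the single outcome that both experts allow, so the combined distribution is sharper than either expert alone.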

Publisher

MIT Press - Journals

Subject

Cognitive Neuroscience, Arts and Humanities (miscellaneous)

Link

https://www.mitpressjournals.org/doi/pdf/10.1162/089976602760128018

References (9 articles)

1. Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images

2. Combining Probability Distributions: A Critique and an Annotated Bibliography

3. Bias/Variance Decompositions for Likelihood-Based Estimators

4. The "Wake-Sleep" Algorithm for Unsupervised Neural Networks

Cited by 2562 articles.

1. Data-driven prediction of convective heat transfer coefficients in internal walls of aero-engine bearing chambers using Mind Evolution Algorithm-Enhanced Bayesian regularization neural networks;Applied Thermal Engineering;2024-12

2. Learning restricted Boltzmann machines with pattern induced weights;Neurocomputing;2024-12

3. Deep generative models for detector signature simulation: A taxonomic review;Reviews in Physics;2024-12

4. Prognostics and health management of photovoltaic systems based on deep learning: A state-of-the-art review and future perspectives;Renewable and Sustainable Energy Reviews;2024-11

5. Asymmetric double-winged multi-view clustering network for exploring diverse and consistent information;Neural Networks;2024-11