Justifying and Generalizing Contrastive Divergence

Published: 2009-06 Issue: 6 Volume: 21 Pages: 1601-1621
ISSN: 0899-7667
Container-title: Neural Computation
Language: en

Author:

Bengio Yoshua¹, Delalleau Olivier¹

Affiliation:

1. Department of Computer Science and Operations Research, University of Montreal, Montreal, Quebec, Canada

Abstract

We study an expansion of the log likelihood in undirected graphical models such as the restricted Boltzmann machine (RBM), where each term in the expansion is associated with a sample in a Gibbs chain alternating between two random variables (the visible vector and the hidden vector in RBMs). We are particularly interested in estimators of the gradient of the log likelihood obtained through this expansion. We show that its residual term converges to zero, justifying the use of a truncation—running only a short Gibbs chain, which is the main idea behind the contrastive divergence (CD) estimator of the log-likelihood gradient. By truncating even more, we obtain a stochastic reconstruction error, related through a mean-field approximation to the reconstruction error often used to train autoassociators and stacked autoassociators. The derivation is not specific to the particular parametric forms used in RBMs and requires only convergence of the Gibbs chain. We present theoretical and empirical evidence linking the number of Gibbs steps k and the magnitude of the RBM parameters to the bias in the CD estimator. These experiments also suggest that the sign of the CD estimator is correct most of the time, even when the bias is large, so that CD-k is a good descent direction even for small k.
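
As a rough illustration of the truncation idea described in the abstract, the following is a minimal sketch of a CD-k gradient estimate for a binary RBM. It is not taken from the paper: it assumes binary visible and hidden units, an energy of the form E(v, h) = -b·v - c·h - hᵀWv, and NumPy; the names `cd_k_gradient`, `W`, `b`, and `c` are hypothetical and chosen only for this example.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def cd_k_gradient(v0, W, b, c, k=1, rng=None):
    """Sketch of a CD-k estimate of the log-likelihood gradient (binary RBM).

    v0: (n_visible,) binary training vector
    W:  (n_hidden, n_visible) weights (assumed layout)
    b:  (n_visible,) visible biases; c: (n_hidden,) hidden biases
    """
    rng = np.random.default_rng() if rng is None else rng

    # Positive phase: hidden activation probabilities given the data vector.
    ph0 = sigmoid(c + W @ v0)

    # Run a short Gibbs chain of k steps, alternating h | v and v | h.
    v, ph = v0, ph0
    for _ in range(k):
        h = (rng.random(ph.shape) < ph).astype(float)  # sample h ~ P(h | v)
        pv = sigmoid(b + W.T @ h)
        v = (rng.random(pv.shape) < pv).astype(float)  # sample v ~ P(v | h)
        ph = sigmoid(c + W @ v)

    # Negative phase uses the k-th chain sample in place of a model sample.
    dW = np.outer(ph0, v0) - np.outer(ph, v)
    db = v0 - v
    dc = ph0 - ph
    return dW, db, dc


# Tiny usage example with zero-initialized parameters (illustrative only).
v0 = np.array([1.0, 0.0, 1.0, 1.0])
W = np.zeros((3, 4))
b = np.zeros(4)
c = np.zeros(3)
dW, db, dc = cd_k_gradient(v0, W, b, c, k=1, rng=np.random.default_rng(0))
```

A parameter update under this sketch would add a small multiple of `dW`, `db`, and `dc` to `W`, `b`, and `c`. Larger `k` lengthens the Gibbs chain, which the abstract links to a smaller bias in the estimator.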

Publisher

MIT Press - Journals

Subject

Cognitive Neuroscience, Arts and Humanities (miscellaneous)

Link

https://www.mitpressjournals.org/doi/pdf/10.1162/neco.2008.11-07-647

References: 25 articles.

1. Auto-association by multilayer perceptrons and singular value decomposition

Cited by 110 articles.

1. Learning restricted Boltzmann machines with pattern induced weights; Neurocomputing; 2024-12

2. Stochastic artificial neuron based on Ovonic Threshold Switch (OTS) and its applications for Restricted Boltzmann Machine (RBM); Chaos, Solitons & Fractals; 2024-09

3. Data-Driven Robust Adaptive Control With Deep Learning for Wastewater Treatment Process; IEEE Transactions on Industrial Informatics; 2024-01

4. The Capabilities of Boltzmann Machines to Detect and Reconstruct Ising System’s Configurations from a Given Temperature; Entropy; 2023-12-12

5. Restricted Boltzmann Machines; Neural Networks and Deep Learning; 2023