Abstract
Restricted Boltzmann machines (RBMs) are a powerful class of generative models, but their training requires computing a gradient that, unlike supervised backpropagation on typical loss functions, is notoriously difficult even to approximate. Here, we show that properly combining standard gradient updates with an off-gradient direction, constructed from samples of the RBM ground state (mode), improves training dramatically over traditional gradient methods. This approach, which we call ‘mode-assisted training’, promotes faster training and stability, in addition to lower converged relative entropy (KL divergence). We demonstrate its efficacy on synthetic datasets where we can compute KL divergences exactly, as well as on a larger machine learning standard (MNIST). The proposed mode-assisted training can be applied in conjunction with any given gradient method, and is easily extended to more general energy-based neural network structures such as deep, convolutional and unrestricted Boltzmann machines.
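The idea of mixing gradient updates with a mode-driven update can be illustrated on an RBM small enough to enumerate. The following is a minimal sketch under stated assumptions, not the authors' implementation: the model term of the update is occasionally replaced by the statistics of the single most probable joint state, found here by brute force. The function names, the fixed mixing probability `p_mode`, the learning rate, and the use of exact enumeration in place of sampling are all illustrative choices that the abstract does not specify.

```python
import itertools
import numpy as np

def all_states(n):
    """All 2**n binary vectors, one per row."""
    return np.array(list(itertools.product([0.0, 1.0], repeat=n)))

def joint(W, a, b):
    """Exact joint p(v, h) of a tiny RBM by enumeration (rows: v, columns: h)."""
    V, H = all_states(W.shape[0]), all_states(W.shape[1])
    # Energy E(v, h) = -a.v - b.h - v^T W h for every (v, h) pair.
    E = -(V @ a)[:, None] - (H @ b)[None, :] - V @ W @ H.T
    p = np.exp(-(E - E.min()))  # shift by the minimum for numerical stability
    return V, H, p / p.sum()

def exact_kl(data_p, W, a, b):
    """KL(data || model) over visible states, computed exactly."""
    _, _, p = joint(W, a, b)
    pv = p.sum(axis=1)  # marginal over hidden units
    m = data_p > 0
    return float(np.sum(data_p[m] * np.log(data_p[m] / pv[m])))

def train(data_p, n_hidden, steps=200, lr=0.1, p_mode=0.0, seed=0):
    """Train a tiny RBM; with probability p_mode (an assumed, fixed value)
    the model term is replaced by the statistics of the joint mode."""
    rng = np.random.default_rng(seed)
    n_visible = int(np.log2(len(data_p)))
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    a, b = np.zeros(n_visible), np.zeros(n_hidden)
    for _ in range(steps):
        V, H, p = joint(W, a, b)
        ph = 1.0 / (1.0 + np.exp(-(V @ W + b)))  # p(h_j = 1 | v) for each v
        # Data-dependent ("positive") statistics.
        pos_W = V.T @ (data_p[:, None] * ph)
        pos_a = data_p @ V
        pos_b = data_p @ ph
        if rng.random() < p_mode:
            # Mode step: use the most probable joint state (lowest energy).
            i, j = np.unravel_index(np.argmax(p), p.shape)
            neg_W, neg_a, neg_b = np.outer(V[i], H[j]), V[i], H[j]
        else:
            # Exact log-likelihood gradient (stands in for a sampled estimate).
            neg_W = V.T @ p @ H
            neg_a = p.sum(axis=1) @ V
            neg_b = p.sum(axis=0) @ H
        W = W + lr * (pos_W - neg_W)
        a = a + lr * (pos_a - neg_a)
        b = b + lr * (pos_b - neg_b)
    return W, a, b
```

As a usage example, `data_p` is a probability vector over the `2**n_visible` visible states in the row order of `all_states`. Calling `exact_kl(data_p, *train(data_p, n_hidden=2, p_mode=0.2))` gives the exact KL divergence of the trained toy model, which can be compared against the `p_mode=0.0` run. On realistic model sizes enumeration is infeasible, so both the gradient and the mode would have to be estimated by other means.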
Publisher
Springer Science and Business Media LLC
Subject
General Physics and Astronomy
Cited by
12 articles.