1. Bengio, Y., Lamblin, P., Popovici, D. and Larochelle, H. (2007) Greedy layer-wise training of deep networks. In: Advances in Neural Information Processing Systems, 19. MIT Press, Cambridge, MA, pp. 153–160.
2. Bryson (1975) Applied optimal control.
3. Decoste (2002) Training invariant support vector machines. Machine Learning.
4. Hinton (2002) Training products of experts by minimizing contrastive divergence. Neural Computation.
5. Hinton (1995) The wake-sleep algorithm for self-organizing neural networks. Science.