1. Aizenberg I, Aizenberg NN, Vandewalle JPL (2000) Multi-valued and universal binary neurons: theory, learning and applications. Springer, Boston. First work to introduce the term “Deep Learning” to neural networks.
2. AMAmemory (2015) Answer at reddit AMA (Ask Me Anything) on “memory networks” etc. (with references). http://www.reddit.com/r/MachineLearning/comments/2xcyrl/i_am_j%C3%BCrgen_schmidhuber_ama/cp0q12t
3. Amari S-I (1998) Natural gradient works efficiently in learning. Neural Comput 10(2):251–276
4. Baird H (1990) Document image defect models. In: Proceedings of IAPR workshop on syntactic and structural pattern recognition, Murray Hill
5. Baldi P, Pollastri G (2003) The principled design of large-scale recursive neural network architectures – DAG-RNNs and the protein structure prediction problem. J Mach Learn Res 4:575–602