1. Soulie, F.F., and Herault, J. (1990). Probabilistic Interpretation of Feedforward Classification Network Outputs, with Relationships to Statistical Pattern Recognition. Neurocomputing, Springer.
2. Touretzky, D.S. (1990). Training Stochastic Model Recognition Algorithms as Networks can Lead to Maximum Mutual Information Estimation of Parameters. Advances in Neural Information Processing Systems 2, Morgan-Kaufmann.
3. (2020, October 17). ImageNet Large Scale Visual Recognition Challenge (ILSVRC). Available online: http://www.image-net.org/challenges/LSVRC/.
4. Do deep nets really need to be deep?;Ba;Adv. Neural Inf. Process. Syst.,2014
5. Hinton, G., Vinyals, O., and Dean, J. (2015). Distilling the Knowledge in a Neural Network. arXiv.