Abstract
Understanding the inner behaviour of multilayer perceptrons during and after training is a goal of paramount importance for many researchers worldwide. This article experimentally shows that relevant patterns emerge upon training, which are typically related to the underlying problem difficulty. The occurrence of these patterns is highlighted by means of $\langle \varphi, \delta \rangle$ diagrams, a 2D graphical tool originally devised to support the work of researchers on classifier performance evaluation and on feature assessment. On the assumption that multilayer perceptrons are powerful engines for feature encoding, hidden layers have been inspected as if they were in fact hosting new input features. Interestingly, there are problems that appear difficult if dealt with using a single hidden layer, whereas they turn out to be easier upon the addition of further layers. The experimental findings reported in this article give further support to the standpoint according to which implementing neural architectures with multiple layers may help to boost their generalisation ability. A generic training strategy inspired by some relevant recommendations of deep learning has also been devised. A basic implementation of this strategy has been used throughout the experiments aimed at identifying relevant patterns inside multilayer perceptrons. Further experiments performed in a comparative setting have shown that it could be adopted as a viable alternative to the classical backpropagation algorithm.
Publisher
Springer Science and Business Media LLC