Affiliation:
1. Department of Computer Science, University of Texas at El Paso, TX, USA
2. Department of Teacher Education, University of Texas at El Paso, TX, USA
Abstract
Neural networks – specifically, deep neural networks – are, at present, the most effective machine learning techniques. There are reasonable explanations of why deep neural networks work better than traditional "shallow" ones, but a more basic question remains: why neural networks in the first place? Why not networks built from non-linear functions of some other family? In this paper, we provide a possible theoretical answer to this question: we show that, among all families with the smallest possible number of parameters, the families corresponding to neurons are indeed optimal – and this holds for every optimality criterion that satisfies some reasonable requirements, namely, every criterion that is final and invariant with respect to coordinate changes, changes of measuring units, and similar linear transformations.
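The abstract does not spell out what "families corresponding to neurons" means; the following LaTeX sketch records the standard neuron family that the claim presumably refers to (the activation function s and the parameter count are standard conventions, not stated in this abstract):

% Standard neuron family: a fixed non-linear activation s applied to an
% affine combination of the inputs x_1, ..., x_n. The tunable parameters
% are the weights w_0, w_1, ..., w_n, i.e. n + 1 parameters for a
% non-linear function of n inputs.
\[
  y = s\Bigl(w_0 + \sum_{i=1}^{n} w_i \, x_i\Bigr)
\]
% Invariance under a change of measuring units x_i -> lambda_i * x_i is
% absorbed by re-scaling the weights, w_i -> w_i / lambda_i, so such
% linear transformations map the family onto itself; this is the kind of
% invariance requirement the abstract imposes on the optimality criterion.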
Subject
Artificial Intelligence, General Engineering, Statistics and Probability