Abstract
In this paper we propose a procedure that enables the training of several independent Multilayer Perceptron Neural Networks with different numbers of neurons and activation functions in parallel (ParallelMLPs), exploiting the principle of locality and the parallelization capabilities of modern CPUs and GPUs. The core idea of this technique is to represent several sub-networks as a single large network and to use a Modified Matrix Multiplication that replaces the ordinary matrix multiplication with two simple matrix operations, allowing separate and independent paths for gradient flow. We assessed our algorithm on simulated datasets, varying the number of samples, features, and batches across 10,000 different models, as well as on the MNIST dataset. We achieved a training speedup of one to four orders of magnitude compared to the sequential approach. The code is available online.
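The abstract does not spell out the two matrix operations, so the following is only a minimal sketch of the idea, assuming they amount to a broadcast element-wise product followed by a masked sum over the hidden axis. The names (hidden_sizes, mask, W_in, W_out) are illustrative and not taken from the paper; the point of the sketch is that a block mask keeps each sub-network's output connected only to its own hidden neurons, so gradients outside a sub-network's block are exactly zero.

```python
import torch

torch.manual_seed(0)

batch, n_features = 8, 4
hidden_sizes = [3, 5]                 # two sub-networks with different widths
total_hidden = sum(hidden_sizes)      # all hidden units stacked side by side
n_models = len(hidden_sizes)

X = torch.randn(batch, n_features)
W_in = torch.randn(n_features, total_hidden, requires_grad=True)   # input layer of all models
W_out = torch.randn(total_hidden, n_models, requires_grad=True)    # one output per model

# Block mask: output j only "sees" the hidden units of sub-network j.
mask = torch.zeros(total_hidden, n_models)
start = 0
for j, h in enumerate(hidden_sizes):
    mask[start:start + h, j] = 1.0
    start += h

H = torch.tanh(X @ W_in)              # hidden activations of all models at once

# An ordinary matmul H @ W_out would connect every output to every hidden unit.
# Replacing it with an element-wise product against the masked weights followed
# by a sum over the hidden axis keeps the gradient paths separate per model.
Y = (H.unsqueeze(2) * (W_out * mask).unsqueeze(0)).sum(dim=1)   # (batch, n_models)

Y.sum().backward()
# Gradients outside each sub-network's block are exactly zero, i.e. the paths
# through the large shared matrices remain independent.
print((W_out.grad * (1 - mask)).abs().max())   # tensor(0.)
```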