1. Agarap, A. F. M.: Deep learning using Rectified Linear Units (ReLU),
https://arxiv.org/pdf/1803.08375 (last access: 7 February 2019), 2018.
2. Archambeau, C., Valle, M., Assenza, A., and Verleysen, M.: Assessment of
probability density estimation methods: Parzen window and finite Gaussian
mixtures, IEEE, ISCAS 2006, 21–24 May 2006, Island of Kos, Greece,
https://doi.org/10.1109/ISCAS.2006.1693317, 2006.
3. Bardenet, R. and Kégl, B.: Surrogating the surrogate: accelerating
Gaussian-process-based global optimization with a mixture cross-entropy
algorithm, in: International Conference on Machine Learning, 21–24 June
2010, Haifa, Israel, 55–62, 2010.
4. Basu, A., De, S., Mukherjee, A., and Ullah, E.: Convergence guarantees for
RMSProp and ADAM in nonconvex optimization and their comparison to Nesterov
acceleration on autoencoders, arXiv preprint arXiv:1807.06766, available at:
https://arxiv.org/abs/1807.06766 (last access: 10 March 2019), 2018.
5. Bergstra, J. and Bengio, Y.: Random search for hyper-parameter optimization,
J. Mach. Learn. Res., 13, 281–305, 2012.