1. I. Goodfellow, Y. Bengio, A. Courville, Deep Learning (MIT Press, 2016)
2. B.D. Haeffele, R. Vidal, Global optimality in neural network training, in 2017 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, Honolulu, HI, USA, 21–26 July 2017, pp. 4390–4398
3. R. Vidal, J. Bruna, R. Giryes, S. Soatto, Mathematics of deep learning, in Proceedings of the Conference on Decision and Control (CDC) (2017)
4. V.N. Vapnik, A. Chervonenkis, The necessary and sufficient conditions for consistency in the empirical risk minimization method. Pattern Recogn. Image Anal. 1(3), 260–284 (1991)
5. P.L. Bartlett, S. Mendelson, Rademacher and Gaussian complexities: risk bounds and structural results. J. Mach. Learn. Res. 3(3), 463–482 (2002)