1. Tao, T.: 254A, Notes 1: Concentration of measure (2010). https://terrytao.wordpress.com/2010/01/03/254a-notes-1-concentration-of-measure/
2. Abadi, M., Agarwal, A., et al.: TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems (2016). arXiv preprint arXiv:1603.04467
3. Allen-Zhu, Z.: Katyusha: The first direct acceleration of stochastic gradient methods. J. Mach. Learn. Res. 18, 1–51 (2018)
4. Arjovsky, M., Bottou, L.: Towards Principled Methods for Training Generative Adversarial Networks (2017). arXiv preprint arXiv:1701.04862
5. Bertsekas, D.P.: Nonlinear Programming. Athena Scientific, Belmont (1999)