1. Bertsekas, D.P.: Nonlinear Programming. Athena Scientific, Massachusetts (1999)
2. Drori, Y., Teboulle, M.: Performance of first-order methods for smooth convex minimization: a novel approach. Math. Program. 145(1–2), 451–482 (2014)
3. Luenberger, D.G., Ye, Y.: Linear and Nonlinear Programming. Springer, Berlin (2008)
4. Neelakantan, A., Vilnis, L., Le, Q.V., Sutskever, I., Kaiser, L., Kurach, K., Martens, J.: Adding gradient noise improves learning for very deep networks (2015). arXiv:1511.06807v1
5. Nemirovski, A.: Optimization II: numerical methods for nonlinear continuous optimization. In: Lecture Notes (1999). http://www2.isye.gatech.edu/~nemirovs/Lect_OptII.pdf