1. Agarwal, A., Bottou, L.: A lower bound for the optimization of finite sums. In: Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, Lille, France, 6–11 July 2015, pp. 78–86.
http://leon.bottou.org/papers/agarwal-bottou-2015
2. Allen-Zhu, Z.: Natasha 2: faster non-convex optimization than SGD. In: Bengio, S., Wallach, H., Larochelle, H., Grauman, K., Cesa-Bianchi, N., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 31, pp. 2675–2686. Curran Associates, Inc. (2018).
http://papers.nips.cc/paper/7533-natasha-2-faster-non-convex-optimization-than-sgd.pdf
3. Bottou, L., Curtis, F.E., Nocedal, J.: Optimization methods for large-scale machine learning. SIAM Rev. 60(2), 223–311 (2018).
https://doi.org/10.1137/16M1080173
4. Chen, H.F., Gao, A.J.: Robustness analysis for stochastic approximation algorithms. Stochast. Stochast. Rep. 26(1), 3–20 (1989).
https://doi.org/10.1080/17442508908833545
5. Chen, H.F., Guo, L., Gao, A.J.: Convergence and robustness of the Robbins-Monro algorithm truncated at randomly varying bounds. Stoch. Process. Appl. 27, 217–231 (1987).
https://doi.org/10.1016/0304-4149(87)90039-1, http://www.sciencedirect.com/science/article/pii/0304414987900391