1. Agarwal, A., Bottou, L.: A lower bound for the optimization of finite sums. In: Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, Lille, France, 6–11 July 2015, pp. 78–86. JMLR Workshop and Conference Proceedings (2015). http://leon.bottou.org/papers/agarwal-bottou-2015
2. Agarwal, N., Allen-Zhu, Z., Bullins, B., Hazan, E., Ma, T.: Finding approximate local minima faster than gradient descent. In: Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017, pp. 1195–1199. Association for Computing Machinery, June 2017. https://doi.org/10.1145/3055399.3055464
3. Allen-Zhu, Z., Hazan, E.: Variance reduction for faster non-convex optimization. In: Balcan, M., Weinberger, K. (eds.) 33rd International Conference on Machine Learning, ICML 2016, pp. 1093–1101. International Machine Learning Society (IMLS), January 2016
4. Allen-Zhu, Z.: Natasha 2: faster non-convex optimization than SGD. In: Bengio, S., Wallach, H., Larochelle, H., Grauman, K., Cesa-Bianchi, N., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 31, pp. 2675–2686. Curran Associates, Inc. (2018)
5. Babanezhad Harikandeh, R., Ahmed, M.O., Virani, A., Schmidt, M., Konečný, J., Sallinen, S.: Stop wasting my gradients: practical SVRG. In: Cortes, C., Lawrence, N., Lee, D., Sugiyama, M., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 28. Curran Associates, Inc. (2015)