1. Léon, B.: Large-scale machine learning with stochastic gradient descent. In: Proceedings of the 19th International Conference on Computational Statistics (COMPSTAT 2010), pp. 177–187. Springer, Paris, France, August 2010 (2010)
2. Mu, L., Tong, Z., Chen, Y., Smola, A.J.: Efficient mini-batch training for stochastic optimization. In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 661–670. ACM (2014)
3. Benjamin, R., Christopher, R., Stephen, W., Feng, N.: HOGWILD!: a lock-free approach to parallelizing stochastic gradient descent. In: Advances in Neural Information Processing Systems, pp. 693–701 (2011)
4. Schraudolph, N., Jin, Y., Günter, S.: A stochastic quasi-newton method for online convex optimization. J. Machine Learn. Res. 2, 428–435 (2007)
5. Shalev-Shwartz, S., Zhang, T.: Stochastic dual coordinate ascent methods for regularized loss. J. Mach. Learn. Res. 14(1), 567–599 (2013)