1. Barré, M., Taylor, A., d’Aspremont, A.: Complexity guarantees for Polyak steps with momentum. In: 33rd Annual Conference on Learning Theory, Proceedings of Machine Learning Research, Vol. 125, pp. 1–27 (2020)
2. Defazio, A.: Momentum via primal averaging: theoretical insights and learning rate schedules for non-convex optimization. arXiv:2010.00406 (2020)
3. Défossez, A., Bottou, L., Bach, F., Usunier, N.: A simple convergence proof of Adam and Adagrad. Transactions on Machine Learning Research. arXiv:2003.02395 (2022)
4. Diakonikolas, J., Jordan, M.I.: Generalized momentum-based methods: a Hamiltonian perspective. SIAM J. Optim. 31(1), 915–944 (2021)
5. Ganesh, S., Deb, R., Thoppe, G., Budhiraja, A.: Does momentum help in stochastic optimization? A sample complexity analysis. arXiv:2110.15547 (2022)