1. Optimization methods for large-scale machine learning;Bottou;SIAM Rev.,2018
2. Stopping criteria for, and strong convergence of, stochastic gradient descent on Bottou-Curtis-Nocedal functions;Patel;Math. Program.,2021
3. Better theory for SGD in the nonconvex world;Khaled,2020
4. Stochastic gradient descent on nonconvex functions with general noise models;Patel,2021
5. Global convergence and stability of stochastic gradient descent;Patel,2021