1. Annealed gradient descent for deep learning;Pan,2015
2. Online learning and stochastic approximations;Bottou;On-Line Learn. Neural Netw.,1998
3. Large scale online learning;LeCun,2004
4. A stochastic gradient method with an exponential convergence rate for finite training sets;Roux,2012
5. Stochastic gradient descent for non-smooth optimization: Convergence results and optimal averaging schemes;Shamir,2013