1. Neural Networks and Learning Machines;Haykin,2009
2. An overview of gradient descent optimization algorithms;Ruder;arXiv,2016
3. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization;Duchi;J. Mach. Learn. Res.,2011
4. ADADELTA: An Adaptive Learning Rate Method;Zeiler;arXiv,2012
5. Neural Networks for Machine Learning;Tieleman,2012