1. Kingma, D.P., and Ba, J.L. (2015, May 7–9). Adam: A Method for Stochastic Optimization. Proceedings of the 3rd International Conference on Learning Representations, San Diego, CA, USA.
2. Tas, E. (2010, July 11–14). Learning Parameter Optimization of Stochastic Gradient Descent with Momentum for a Stochastic Quadratic. Proceedings of the 24th European Conference on Operational Research (EURO XXIV), Lisbon, Portugal.
3. Duchi, J., Hazan, E., and Singer, Y. (2011). Adaptive Subgradient Methods for Online Learning and Stochastic Optimization. J. Mach. Learn. Res.
4. Ruder, S. (2016). An Overview of Gradient Descent Optimization Algorithms. arXiv, arXiv:1609.04747.
5. Krizhevsky, A., Sutskever, I., and Hinton, G.E. (2017). ImageNet Classification with Deep Convolutional Neural Networks. Commun. ACM.