1. D.P. Kingma, J. Ba, Adam: a method for stochastic optimization, in 3rd International Conference on Learning Representations, ICLR 2015, Conference Track Proceedings, Dec 2014. Accessed: 28 Jul 2020. (Online). Available: http://arxiv.org/abs/1412.6980
2. T. Tieleman, G. Hinton, Lecture 6.5-rmsprop: divide the gradient by a running average of its recent magnitude. COURSERA Neural Networks Mach. Learn. 4(2), 26–31 (2012)
3. M.D. Zeiler, ADADELTA: an adaptive learning rate method. Dec 2012. Accessed: 28 Jul 2020. (Online). Available: http://arxiv.org/abs/1212.5701
4. A.J. Turner, J.F. Miller, NeuroEvolution: evolving heterogeneous artificial neural networks. Evol. Intell. 7(3), 135–154 (2014). https://doi.org/10.1007/s12065-014-0115-5
5. A.A. ElSaid, A.G. Ororbia, T.J. Desell, The ant swarm neuro-evolution procedure for optimizing recurrent networks. Sep 2019. Accessed: 30 Apr 2020. (Online). Available: http://arxiv.org/abs/1909.11849