1. Ai, 2016. A zero-gradient-sum algorithm for distributed cooperative learning using a feedforward neural network with random weights. Inf. Sci.
2. Amari, 1993. Backpropagation and stochastic gradient descent method. Neurocomputing.
3. Amirsadri, 2018. A Lévy flight-based grey wolf optimizer combined with back-propagation algorithm for neural network training. Neural Comput. Appl.
4. Apicella, 2021. A survey on modern trainable activation functions. Neural Netw.
5. Bergstra, J., Bardenet, R., Bengio, Y., Kégl, B., 2011. Algorithms for hyper-parameter optimization. In: Shawe-Taylor, J., Zemel, R., Bartlett, P., Pereira, F., Weinberger, K.Q. (Eds.), 25th Annual Conference on Neural Information Processing Systems. Vol. 24. Curran Associates, Inc., pp. 2546–2554.