1. Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning internal representations by error propagation. In Parallel distributed processing: Explorations in the microstructure of cognition (Chap. 8). Cambridge: MIT Press.
2. Fahlman, S. E. (1988). Faster-learning variations on backpropagation: An empirical study. In Proceedings of the 1988 Connectionist Models Summer School (pp. 38–51).
3. Chen, J. R., & Mars, P. (1990). Stepsize variation methods for accelerating the back-propagation algorithm. In Proceedings of the International Joint Conference on Neural Networks (Vol. 1, pp. 601–604).
4. Ng, S. C., Cheung, C.-C., & Leung, S. H. (2004). Magnified gradient function with deterministic weight evolution in adaptive learning. IEEE Transactions on Neural Networks, 15(6), 1411–1423.
5. Gori, M., & Tesi, A. (1992). On the problem of local minima in back-propagation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(1), 76–86.