Publisher
Springer International Publishing
References (13 articles)
1. Bosman, A.S., Engelbrecht, A., Helbig, M.: Visualising basins of attraction for the cross-entropy and the squared error neural network loss functions. Neurocomputing 400, 113–136 (2020). https://doi.org/10.1016/j.neucom.2020.02.113
2. Bourlard, H.A., Morgan, N.: Connectionist Speech Recognition. Springer, Boston (1994). https://doi.org/10.1007/978-1-4615-3210-1
3. Chaudhari, P., et al.: Entropy-SGD: biasing gradient descent into wide valleys. J. Stat. Mech: Theory Exp. 2019(12), 124018 (2019)
4. Golik, P., Doetsch, P., Ney, H.: Cross-entropy vs. squared error training: a theoretical and experimental comparison. In: 14th Annual Conference of the International Speech Communication Association, pp. 1756–1760. ISCA (2013)
5. Keskar, N.S., Mudigere, D., Nocedal, J., Smelyanskiy, M., Tang, P.T.P.: On large-batch training for deep learning: generalization gap and sharp minima. arXiv preprint arXiv:1609.04836 (2016)
Cited by: 6 articles