Publisher
Springer International Publishing
References (20 articles)
1. Cesa-Bianchi, N., Long, P.M., Warmuth, M.: Worst-case quadratic loss bounds for prediction using linear functions and gradient descent. IEEE Trans. Neural Networks 7(3), 604–619 (1996)
2. Kivinen, J., Warmuth, M.K.: Exponentiated gradient versus gradient descent for linear predictors. Inf. Comput. 132(1), 1–63 (1997)
3. Bottou, L., LeCun, Y.: Large scale online learning. In: Advances in Neural Information Processing Systems 16. MIT Press, Cambridge, MA (2004)
4. Zhang, T.: Solving large scale linear prediction problems using stochastic gradient descent algorithms. In: Proceedings of the 21st International Conference on Machine Learning (ICML), Banff, Alberta, Canada (2004)
5. McCloskey, M., Cohen, N.J.: Catastrophic interference in connectionist networks: the sequential learning problem. Psychol. Learn. Motivation 24, 109–165 (1989)