1. Haykin, S.: Neural Networks: A Comprehensive Foundation. Macmillan, New York (1994)
2. Baum, E.B., Haussler, D.: What size net gives valid generalization? Neural Computation 1, 151–160 (1989)
3. Lawrence, S., Giles, C.L., Tsoi, A.C.: What Size Neural Network Gives Optimal Generalization? Convergence Properties of Backpropagation. Technical Report UMIACS-TR-96-22 and CS-TR-3617, Institute for Advanced Computer Studies, University of Maryland (1996)
4. Caruana, R., Lawrence, S., Giles, C.L.: Overfitting in Neural Networks: Backpropagation, Conjugate Gradient, and Early Stopping. In: Leen, T.K., Dietterich, T.G., Tresp, V. (eds.) Advances in Neural Information Processing Systems, vol. 13, pp. 402–408. MIT Press, Cambridge (2001)
5. Krogh, A., Hertz, J.A.: A simple weight decay can improve generalization. In: Moody, J.E., Hanson, S.J., Lippmann, R.P. (eds.) Advances in Neural Information Processing Systems, vol. 4, pp. 950–957. Morgan Kaufmann, San Mateo (1992)