1. Ba LJ, Caruana R (2014) Do deep networks really need to be deep? In: Ghahramani Z (ed) Advances in neural information processing systems, vol 27. MIT Press, Cambridge, pp 1–9
2. Ball K (1997) An elementary introduction to modern convex geometry. In: Levy S (ed) Flavors of geometry. Cambridge University Press, Cambridge, pp 1–58
3. Barron AR (1992) Neural net approximation. In: Narendra KS (ed) Proceedings of the 7th Yale workshop on adaptive and learning systems. Yale University Press, New Haven, pp 69–72
4. Barron AR (1993) Universal approximation bounds for superpositions of a sigmoidal function. IEEE Trans Inf Theory 39:930–945
5. Bellman R (1957) Dynamic programming. Princeton University Press, Princeton