1. Aizerman, M. A., Braverman, É. M., and Rozonoér, L. I. (1964). Theoretical foundations of the potential function method in pattern recognition learning. Automation and Remote Control 25: 821–837.
2. Alon, N., Ben-David, S., Cesa-Bianchi, N., and Haussler, D. (1997). Scale-sensitive dimensions, uniform convergence, and learnability. Journal of the ACM 44 (4): 615–631.
3. Aronszajn, N. (1950). Theory of reproducing kernels. Transactions of the American Mathematical Society 68: 337–404.
4. Bartlett, P. L., and Shawe-Taylor, J. (1999). Generalization performance of support vector machines and other pattern classifiers. In Schölkopf, B., Burges, C. J. C., and Smola, A. J., eds., Advances in Kernel Methods — Support Vector Learning, 43–54. Cambridge, MA: MIT Press.
5. Berg, C., Christensen, J. P. R., and Ressel, P. (1984). Harmonic Analysis on Semigroups. New York: Springer-Verlag.