1. Bartlett, P., & Mendelson, S. (2003). Rademacher and Gaussian complexities: Risk bounds and structural results. Journal of Machine Learning Research, 3, 463–482.
2. Bridle, J. (1990). Probabilistic interpretation of feedforward classification network outputs, with relationships to statistical pattern recognition. In F. F. Soulie & J. Herault (Eds.), Neurocomputing: Algorithms, architectures and applications (pp. 227–236). Berlin: Springer.
3. Burges, C. J., Le, Q. V., & Ragno, R. (2007). Learning to rank with nonsmooth cost functions. In B. Schölkopf, J. Platt, & T. Hofmann (Eds.), Advances in neural information processing systems (Vol. 19).
4. Burges, C., Shaked, T., Renshaw, E., Lazier, A., Deeds, M., Hamilton, N., et al. (2005). Learning to rank using gradient descent. In Proceedings of the international conference on machine learning.
5. Cao, Y., Xu, J., Liu, T. Y., Li, H., Huang, Y., & Hon, H. W. (2006). Adapting ranking SVM to document retrieval. In SIGIR.