The Existence of A Priori Distinctions Between Learning Algorithms

Published: 1996-10; Volume: 8; Issue: 7; Pages: 1391-1420
ISSN: 0899-7667
Container-title: Neural Computation
Language: en

Author:

Wolpert David H.¹

Affiliation:

1. The Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM, 87501, USA

Abstract

This is the second of two papers that use off-training set (OTS) error to investigate the assumption-free relationship between learning algorithms. The first paper discusses a particular set of ways to compare learning algorithms, according to which there are no distinctions between learning algorithms. This second paper concentrates on different ways of comparing learning algorithms from those used in the first paper. In particular this second paper discusses the associated a priori distinctions that do exist between learning algorithms. In this second paper it is shown, loosely speaking, that for loss functions other than zero-one (e.g., quadratic loss), there are a priori distinctions between algorithms. However, even for such loss functions, it is shown here that any algorithm is equivalent on average to its “randomized” version, and in this still has no first principles justification in terms of average error. Nonetheless, as this paper discusses, it may be that (for example) cross-validation has better head-to-head minimax properties than “anti-cross-validation” (choose the learning algorithm with the largest cross-validation error). This may be true even for zero-one loss, a loss function for which the notion of “randomization” would not be relevant. This paper also analyzes averages over hypotheses rather than targets. Such analyses hold for all possible priors over targets. Accordingly they prove, as a particular example, that cross-validation cannot be justified as a Bayesian procedure. In fact, for a very natural restriction of the class of learning algorithms, one should use anti-cross-validation rather than cross-validation (!).
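The contrast the abstract draws between zero-one loss and quadratic loss can be checked by brute force on a very small problem. The Python sketch below is only an illustration under assumed conditions, not the paper's construction or proof: the input set `X`, output set `Y`, fixed training inputs `TRAIN_X`, and the three toy algorithms are all hypothetical choices made for this example. It averages off-training-set (OTS) error uniformly over every target function from `X` to `Y`.

```python
from itertools import product

# Hypothetical toy setup (not taken from the paper).
X = [0, 1, 2, 3]          # input space
Y = [0, 1, 2]             # output space
TRAIN_X = [0, 1]          # inputs seen during training
OTS_X = [x for x in X if x not in TRAIN_X]  # off-training-set inputs


def zero_one(y_true, y_guess):
    """Zero-one loss: 0 for a correct guess, 1 otherwise."""
    return 0.0 if y_true == y_guess else 1.0


def quadratic(y_true, y_guess):
    """Quadratic loss: squared difference between target and guess."""
    return float((y_true - y_guess) ** 2)


def guess_low(train, x):
    """Ignore the data and always guess the smallest output value."""
    return min(Y)


def guess_middle(train, x):
    """Ignore the data and always guess the middle output value."""
    return sorted(Y)[len(Y) // 2]


def copy_first(train, x):
    """Data-dependent rule: repeat the label of the first training pair."""
    return train[0][1]


def average_ots_error(algorithm, loss):
    """Mean OTS loss, averaged uniformly over all targets f: X -> Y."""
    total, count = 0.0, 0
    for values in product(Y, repeat=len(X)):
        target = dict(zip(X, values))
        train = [(x, target[x]) for x in TRAIN_X]
        for x in OTS_X:
            total += loss(target[x], algorithm(train, x))
            count += 1
    return total / count


if __name__ == "__main__":
    for algorithm in (guess_low, guess_middle, copy_first):
        print(
            algorithm.__name__,
            average_ots_error(algorithm, zero_one),
            average_ots_error(algorithm, quadratic),
        )
```

Under these assumptions all three algorithms have the same average zero-one OTS error (2/3), while their average quadratic OTS errors differ (5/3 for `guess_low`, 2/3 for `guess_middle`, 4/3 for `copy_first`). This matches the abstract's loose statement that a priori distinctions appear for loss functions other than zero-one. It says nothing about the paper's results on randomized algorithms, minimax comparisons, or cross-validation.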

Publisher

MIT Press - Journals

Subject

Cognitive Neuroscience, Arts and Humanities (miscellaneous)

Link

https://www.mitpressjournals.org/doi/pdf/10.1162/neco.1996.8.7.1391

Cited by 70 articles.

1. Malware Prediction Using Tabular Deep Learning Models; Advances in Intelligent Systems and Computing; 2024

2. References; Reconstructing Olduvai; 2024

3. Impossibility Results in AI: A Survey; ACM Computing Surveys; 2023-08-25

4. The Implications of the No-Free-Lunch Theorems for Meta-induction; Journal for General Philosophy of Science; 2023-03-13

5. Using machine learning on tree‐ring data to determine the geographical provenance of historical construction timbers; Ecosphere; 2023-03