Dataset Characteristics (Metafeatures)

Published: 2022 Page: 53-75
ISSN:1611-2482
Container-title:Metalearning

Author:

Brazdil Pavel, van Rijn Jan N., Soares Carlos, Vanschoren Joaquin

Abstract

Summary. This chapter discusses dataset characteristics that play a crucial role in many metalearning systems. Typically, they help to restrict the search in a given configuration space. The basic characteristic of the target variable, for instance, determines the choice of the right approach. If it is numeric, it suggests that a suitable regression algorithm should be used, while if it is categorical, a classification algorithm should be used instead. This chapter provides an overview of different types of dataset characteristics, which are sometimes also referred to as metafeatures. These are of different types, and include so-called simple, statistical, information-theoretic, model-based, complexity-based, and performance-based metafeatures. The last group of characteristics has the advantage that it can be easily defined in any domain. These characteristics include, for instance, sampling landmarkers representing the performance of particular algorithms on samples of data, and relative landmarkers capturing differences or ratios of performance values and providing estimates of performance gains. The final part of this chapter discusses the specific dataset characteristics used in different machine learning tasks, including classification, regression, time series, and clustering.
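As a rough illustration of three of the metafeature families named in the abstract, the following Python sketch computes a few simple metafeatures, one information-theoretic metafeature (class entropy), and a relative landmarker on a sample of the data. Everything here is an assumption made for illustration and is not taken from the chapter: the function names, the toy dataset, the fixed train/test sample, and the choice of a 1-nearest-neighbour and a majority-class learner as the two landmarked algorithms.

```python
import math
from collections import Counter

def simple_metafeatures(X, y):
    """Simple metafeatures: counts read directly off the dataset."""
    return {
        "n_instances": len(X),
        "n_features": len(X[0]) if X else 0,
        "n_classes": len(set(y)),
    }

def class_entropy(y):
    """Information-theoretic metafeature: Shannon entropy of the target, in bits."""
    n = len(y)
    return -sum((c / n) * math.log2(c / n) for c in Counter(y).values())

def majority_accuracy(y_train, y_test):
    """Landmarker 1 (hypothetical choice): always predict the most frequent training class."""
    majority = Counter(y_train).most_common(1)[0][0]
    return sum(label == majority for label in y_test) / len(y_test)

def one_nn_accuracy(X_train, y_train, X_test, y_test):
    """Landmarker 2 (hypothetical choice): 1-nearest-neighbour with squared Euclidean distance."""
    def predict(x):
        dists = [sum((a - b) ** 2 for a, b in zip(x, xt)) for xt in X_train]
        return y_train[dists.index(min(dists))]
    return sum(predict(x) == t for x, t in zip(X_test, y_test)) / len(y_test)

def relative_landmarker(acc_a, acc_b):
    """Relative landmarker: difference between two performance values."""
    return acc_a - acc_b

# Toy classification dataset (illustrative values only).
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [1.0, 0.9], [0.9, 1.1], [1.2, 1.0]]
y = ["a", "a", "a", "b", "b", "b"]

# A fixed sample of the data stands in for the sampling step.
train_idx, test_idx = [0, 1, 3, 4], [2, 5]
X_train = [X[i] for i in train_idx]
y_train = [y[i] for i in train_idx]
X_test = [X[i] for i in test_idx]
y_test = [y[i] for i in test_idx]

meta = simple_metafeatures(X, y)
entropy = class_entropy(y)
acc_nn = one_nn_accuracy(X_train, y_train, X_test, y_test)
acc_majority = majority_accuracy(y_train, y_test)
gain = relative_landmarker(acc_nn, acc_majority)

print(meta, entropy, acc_nn, acc_majority, gain)
```

In a metalearning setting, values such as `meta`, `entropy`, and `gain` would be collected per dataset and used as inputs when deciding which algorithms to try; the difference form used for `gain` is one option, and a ratio of the two accuracies would be another.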

Publisher

Springer International Publishing

Link

https://link.springer.com/content/pdf/10.1007/978-3-030-67024-5_4

Reference: 81 articles.

1. Adya, M., Collopy, F., Armstrong, J., and Kennedy, M. (2001). Automatic identification of time series features for rule-based forecasting. International Journal of Forecasting, 17(2):143–157.

2. Aha, D. W. (1992). Generalizing from case studies: A case study. In Sleeman, D. and Edwards, P., editors, Proceedings of the Ninth International Workshop on Machine Learning (ML92), pages 1–10. Morgan Kaufmann.

3. Atkeson, C. G., Moore, A. W., and Schaal, S. (1997). Locally weighted learning. Artificial Intelligence Review, 11(1-5):11–73.

4. Baldi, P. and Chauvin, Y. (1993). Neural networks for fingerprint recognition. Neural Computation, 5.

5. Bensusan, H. (1998). God doesn’t always shave with Occam’s razor - learning when and how to prune. In ECML ’98: Proceedings of the 10th European Conference on Machine Learning, pages 119–124, London, UK. Springer-Verlag.

Cited by 1 article.

1. Applicability Area: A novel utility-based approach for evaluating predictive models, beyond discrimination; 2023-07-07