Classification and Regression Trees

Published: 2009
Pages: 192-195
Container-title: Encyclopedia of Data Warehousing and Mining, Second Edition

Author:

Gehrke Johannes¹

Affiliation:

1. Cornell University, USA

Abstract

The goal of classification and regression is to build a data mining model that can be used for prediction. To construct such a model, we are given a set of training records, each having several attributes. These attributes can be either numerical (for example, age or salary) or categorical (for example, profession or gender). There is one distinguished attribute, the dependent attribute; the other attributes are called predictor attributes. If the dependent attribute is categorical, the problem is a classification problem. If the dependent attribute is numerical, the problem is a regression problem. The resulting model predicts the value of the dependent attribute for a record where that value is unknown. (We call such a record an unlabeled record.) Classification and regression have a wide range of applications, including scientific experiments, medical diagnosis, fraud detection, credit approval, and target marketing (Hand, 1997). Many classification and regression models have been proposed in the literature; among the more popular are neural networks, genetic algorithms, Bayesian methods, linear and log-linear models and other statistical methods, decision tables, and tree-structured models, the focus of this chapter (Breiman, Friedman, Olshen, & Stone, 1984). Tree-structured models, so-called decision trees, are easy to understand; they are non-parametric and thus do not rely on assumptions about the data distribution; and they have fast construction methods even for large training datasets (Lim, Loh, & Shih, 2000). Most data mining suites include tools for classification and regression tree construction (Goebel & Gruenwald, 1999).
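
The distinction drawn in the abstract between classification and regression, and the idea of a tree-structured model, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the sample records, the attribute names, the helper names (`problem_type`, `best_numeric_split`, `fit_stump`), and the use of Gini impurity as the split criterion, which the abstract itself does not specify. It fits only a single split (a one-level tree), not a full decision tree.

```python
from collections import Counter

# Hypothetical training records. "age" and "salary" are numerical predictor
# attributes, "profession" is a categorical predictor attribute, and
# "approved" is the dependent attribute.
RECORDS = [
    {"age": 25, "salary": 30000, "profession": "clerk", "approved": "no"},
    {"age": 40, "salary": 80000, "profession": "engineer", "approved": "yes"},
    {"age": 35, "salary": 60000, "profession": "engineer", "approved": "yes"},
    {"age": 22, "salary": 20000, "profession": "student", "approved": "no"},
]


def problem_type(records, dependent):
    """Regression if the dependent attribute is numerical, else classification."""
    values = [r[dependent] for r in records]
    numerical = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    return "regression" if numerical else "classification"


def gini(labels):
    """Gini impurity of a list of class labels (assumed split criterion)."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_numeric_split(records, predictor, dependent):
    """Return (threshold, weighted_gini) for the best test `predictor <= threshold`.

    Candidate thresholds are midpoints between consecutive distinct values.
    """
    values = sorted({r[predictor] for r in records})
    if len(values) < 2:
        raise ValueError("need at least two distinct predictor values to split")
    best = None
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [r[dependent] for r in records if r[predictor] <= threshold]
        right = [r[dependent] for r in records if r[predictor] > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(records)
        if best is None or score < best[1]:
            best = (threshold, score)
    return best


def fit_stump(records, predictor, dependent):
    """Fit a one-level tree: one split, majority label in each leaf."""
    threshold, _ = best_numeric_split(records, predictor, dependent)
    left = Counter(r[dependent] for r in records if r[predictor] <= threshold)
    right = Counter(r[dependent] for r in records if r[predictor] > threshold)
    left_label = left.most_common(1)[0][0]
    right_label = right.most_common(1)[0][0]
    return lambda record: left_label if record[predictor] <= threshold else right_label


if __name__ == "__main__":
    print(problem_type(RECORDS, "approved"))  # dependent attribute is categorical
    stump = fit_stump(RECORDS, "age", "approved")
    print(stump({"age": 28}))  # prediction for an unlabeled record
```

With a categorical dependent attribute such as `approved`, the sketch treats the task as classification; pointing `problem_type` at a numerical attribute such as `salary` would make it a regression problem, for which a variance-based criterion would be the usual substitute for Gini impurity.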

Publisher

IGI Global

References: 14 articles.

1. Breiman, L., Friedman, J. H., Olshen, R. A., & Stone, C. J. (1984). Classification and Regression Trees, Kluwer Academic Publishers.

2. Caruana, R., Niculescu-Mizil, A., Crew, R., & Ksikes, A. (2004). Ensemble selection from libraries of models. In Proceedings of the Twenty-first International Conference on Machine Learning (ICML 2004), Banff, Alberta, Canada, July 4-8, 2004. ACM.

3. Dalvi, N., Domingos, P., Mausam, Sanghai, S., & Verma, D. (2004). Adversarial Classification. In Proceedings of the Tenth International Conference on Knowledge Discovery and Data Mining, 99-108. Seattle, WA: ACM Press.

4. Dobra, A., & Gehrke, J. (2002). SECRET: A Scalable Linear Regression Tree Algorithm. In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2002). Edmonton, Alberta, Canada.

5. Domingos, P. (2002). Learning from Infinite Data in Finite Time. In Advances in Neural Information Processing Systems.

Cited by 4 articles.

1. Skin Cancer Classification Using Deep Learning;Lecture Notes in Electrical Engineering;2023

2. Semi-Supervised Learning with GANs for Melanoma Detection;2022 6th International Conference on Intelligent Computing and Control Systems (ICICCS);2022-05-25

3. Artificial neural network in the discrimination of lung cancer based on infrared spectroscopy;PLOS ONE;2022-05-12

4. Fast Human Activity Recognition Based on a Massively Parallel Implementation of Random Forest;Intelligent Information and Database Systems;2016