Affiliation:
1. York University, Canada
Abstract
Generally speaking, classification is the action of assigning an object to a category according to the characteristics of the object. In data mining, classification refers to the task of analyzing a set of pre-classified data objects to learn a model (or a function) that can be used to classify an unseen data object into one of several predefined classes. A data object, referred to as an example, is described by a set of attributes or variables. One of the attributes describes the class that an example belongs to and is thus called the class attribute or class variable. Other attributes are often called independent or predictor attributes (or variables). The set of examples used to learn the classification model is called the training data set. Tasks related to classification include regression, which builds a model from training data to predict numerical values, and clustering, which groups examples to form categories. Classification belongs to the category of supervised learning, distinguished from unsupervised learning. In supervised learning, the training data consists of pairs of input data (typically vectors), and desired outputs, while in unsupervised learning there is no a priori output. Classification has various applications, such as learning from a patient database to diagnose a disease based on the symptoms of a patient, analyzing credit card transactions to identify fraudulent transactions, automatic recognition of letters or digits based on handwriting samples, and distinguishing highly active compounds from inactive ones based on the structures of compounds for drug discovery.
Reference20 articles.
1. An, A., & Cercone, N. (1998). ELEM2: A learning system for more accurate classifications. Proceedings of the 12th Canadian Conference on Artificial Intelligence (pp. 426-441).
2. Breiman, L., Friedman, J., Olshen, R., & Stone, C. (1984). Classification and regression trees. Wadsworth International Group.
3. Operations for learning with graphical models.;W. L.Buntine;Journal of Artificial Intelligence Research,1994
4. Castillo, E., Gutiérrez, J. M., & Hadi, A. S. (1997). Expert systems and probabilistic network models. New York: Springer-Verlag.
5. Clark, P., & Boswell, R. (1991). Rule induction with CN2: Some recent improvements. Proceedings of the 5th European Working Session on Learning (pp. 151-163).
Cited by
2 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献