Affiliation:
1. Eskişehir Teknik Üniversitesi
2. TC Vakıflar Bankası
Abstract
In data mining, classification builds an interdisciplinary field upon from statistics, computer science, mathematics and many other disciplines. There are numerous statistical applications where parametric and non-parametric methods are frequently used to train data to estimate mapping function. In this study, two of the most widely used techniques are applied to a real dataset. The goal of the study is to compare the classification success of ordinal logistic regression and the classification trees and to predict a categorical response. For this purpose, the potential factors affecting the housing unit price for sale as being the dependent variable with three classes in Eskişehir were examined. The real data set was split into three as train, validation and test groups. The classification performance of the techniques was demonstrated with 5-fold cross validation technique. According to the results, a more successful classification was made with the classification trees algorithm.
Publisher
SDU Journal of Natural and Applied Sciences
Reference35 articles.
1. [1] Berkson J “Application of the logistic function to bio-assay”. Journal of the American Statistical Association, 39(227), 357-365, 1944.
2. [2] Finney DJ. Probit Analysis. 3rd ed. Cambridge UK, Cambridge University Press, 1971.
3. [3] Freeman DH. Applied Categorical Data Analysis. New York USA, Marcel Dekker Inc., 1987.
4. [4] Cox DR. Analysis of Binary Data. 2nd ed. London UK, Chapman and Hall, 1970.
5. [5] Aranda-Ordaz FJ. “On two families of transformations to additive for binary response”. Biometrika, 68(2), 357-363, 1981.