Affiliation:
1. Institute of Computer Science of the Czech Academy of Sciences, Prague, Czech Republic
2. Faculty of Applied Informatics, Tomas Bata University, Nad Stranemi, Zlin, Czech Republic
Abstract
Analyzing the conditions for a good distance function, we identify four rules that such a function should fulfill. We then introduce two new distance functions, one a metric and the other a pseudometric. We tested how well they suit distance-based classifiers, especially the IINC classifier. We rank distance functions according to several criteria and tests. The rankings depend not only on the criterion or the nature of the statistical test, but also on whether the test takes into account the differing difficulty of the tasks or treats all tasks as equally difficult. We found that the new distance functions are among the four or five best of 23 distance functions. We tested them on 24 different tasks, using the mean, the median, the Friedman aligned test, and the Quade test. Our results show that a suitable distance function can improve the behavior of distance-based classification rules.
Funder
Czech Ministry of Education, Youth and Sports
Publisher
Association for Computing Machinery (ACM)