Affiliation:
1. Department of Computing and Software, McMaster University, Hamilton, ON L8S 4L8, Canada
Abstract
Classification, the task of discerning the class of an unlabeled data point using information from a set of labeled data points, is a well-studied area of machine learning with a variety of approaches. Many of these approaches are closely linked to the selection of metrics or the generalizing of similarities defined by kernels. These metrics or similarity measures often require their parameters to be tuned in order to achieve the highest accuracy for each dataset. For example, an extensive search is required to determine the value of K or the choice of distance metric in K-NN classification. This paper explores a method of kernel construction that when used in classification performs consistently over a variety of datasets and does not require the parameters to be tuned. Inspired by dimensionality reduction techniques (DRT), we construct a kernel-based similarity measure that captures the topological structure of the data. This work compares the accuracy of K-NN classifiers, computed with specific operating parameters that obtain the highest accuracy per dataset, to a single trial of the here-proposed kernel classifier with no specialized parameters on standard benchmark sets. The here-proposed kernel used with simple classifiers has comparable accuracy to the ‘best-case’ K-NN classifiers without requiring the tuning of operating parameters.
Reference40 articles.
1. Effects of distance measure choice on k-nearest neighbor classifier performance: A review;Hassanat;Big Data,2019
2. Alkasassbeh, M., Altarawneh, G.A., and Hassanat, A. (2015). On enhancing the performance of nearest neighbour classifiers using hassanat distance metric. arXiv.
3. Study of distance metrics on k-nearest neighbor algorithm for star categorization;Nayak;J. Phys. Conf. Ser.,2022
4. Hybrid metric k-nearest neighbor algorithm and applications;Zhang;Math. Probl. Eng.,2022
5. Yean, C.W., Khairunizam, W., Omar, M.I., Murugappan, M., Zheng, B.S., Bakar, S.A., Razlan, Z.M., and Ibrahim, Z. (2018, January 15–17). Analysis of the distance metrics of KNN classifier for EEG signal in stroke patients. Proceedings of the 2018 International Conference on Computational Approach in Smart Systems Design and Applications (ICASSDA), Kuching, Malaysia.