Affiliations:
1. University of Western Ontario, London, Canada
2. University of Waterloo, Waterloo, Canada
Abstract
The k-nearest-neighbor (kNN) approach is a simple and effective nonparametric algorithm for classification. One drawback of kNN is that it can give only coarse estimates of class probabilities, particularly for small values of k. To avoid this drawback, we propose a new nonparametric classification method based on nearest neighbors conditional on each class (kCNN): the proposed approach computes the distance between a new instance and the kth nearest neighbor of each class, estimates posterior probabilities of class membership from these distances, and assigns the instance to the class with the largest posterior probability. We prove that the proposed approach converges to the Bayes classifier as the size of the training data increases. Further, we extend the proposed approach to an ensemble method. Experiments on benchmark data sets show that both kCNN and its ensemble version on average outperform kNN, weighted kNN, probabilistic kNN, and two similar algorithms (LMkNN and MLM-kHNN) in terms of error rate. A simulation shows that kCNN may be useful for estimating posterior probabilities when the class distributions overlap.
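To make the procedure concrete, the following is a minimal Python sketch of the class-conditional kth-nearest-neighbor idea described in the abstract, assuming Euclidean distance and the standard kNN density estimate for the posterior; the paper's exact estimator and tie-handling may differ, and the helper names kcnn_posteriors and kcnn_predict are hypothetical.

```python
import numpy as np

def kcnn_posteriors(X_train, y_train, x, k=3):
    """Estimate class posteriors for a query point x from the
    class-conditional kth-nearest-neighbor distances (kCNN idea).

    Sketch assumption: p(x | c) is approximated by the standard kNN
    density estimate k / (n_c * V(r_c)), where r_c is the distance
    from x to the kth nearest neighbor within class c and V(r_c) is
    proportional to r_c ** p (the constant cancels on normalization).
    """
    classes = np.unique(y_train)
    p = X_train.shape[1]          # feature dimension
    n = len(y_train)
    scores = {}
    for c in classes:
        Xc = X_train[y_train == c]
        dists = np.sort(np.linalg.norm(Xc - x, axis=1))
        r_c = dists[min(k, len(dists)) - 1]   # kth NN distance within class c
        n_c = len(Xc)
        # prior (n_c / n) times density estimate k / (n_c * r_c**p);
        # the small floor guards against a zero distance
        scores[c] = (n_c / n) * k / (n_c * max(r_c, 1e-12) ** p)
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}

def kcnn_predict(X_train, y_train, x, k=3):
    """Assign x to the class with the largest estimated posterior."""
    post = kcnn_posteriors(X_train, y_train, x, k)
    return max(post, key=post.get)
```

Because the class counts cancel, each unnormalized score in this sketch scales as r_c raised to the power -p, so the instance is assigned to the class whose kth conditional neighbor is closest, matching the intuition stated in the abstract.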
Funder
Social Sciences and Humanities Research Council of Canada
Cited by: 17 articles.