Learning k for kNN Classification

Published:2017-04-22 Issue:3 Volume:8 Page:1-19
ISSN:2157-6904
Container-title:ACM Transactions on Intelligent Systems and Technology
language:en
Short-container-title:ACM Trans. Intell. Syst. Technol.

Author:

Zhang Shichao¹, Li Xuelong², Zong Ming¹, Zhu Xiaofeng¹, Cheng Debo¹

Affiliation:

1. Guangxi Key Lab of MIMS & Guangxi Normal University, Guilin, Guangxi, PR China

2. Chinese Academy of Sciences, Shaanxi, P. R. China

Abstract

The k Nearest Neighbor (kNN) method has been widely used in data mining and machine learning applications because of its simple implementation and strong performance. However, assigning the same k value to all test data, as previous kNN methods do, has been shown to make these methods impractical in real applications. This article proposes learning a correlation matrix that reconstructs test data points from training data so that different k values can be assigned to different test data points; the resulting method is referred to as Correlation Matrix kNN (CM-kNN for short) classification. Specifically, a least-squares loss function is employed to minimize the error of reconstructing each test data point from all training data points. Then, a graph Laplacian regularizer is used to preserve the local structure of the data in the reconstruction process. Moreover, an ℓ1-norm regularizer and an ℓ2,1-norm regularizer are applied, respectively, to learn different k values for different test data and to induce row sparsity that removes redundant or noisy features from the reconstruction process. Besides classification tasks, the kNN methods (including the proposed CM-kNN method) are further applied to regression and missing data imputation. We conducted sets of experiments to illustrate the efficiency of the method, and the experimental results showed that the proposed method was more accurate and efficient than existing kNN methods in data-mining applications such as classification, regression, and missing data imputation.
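The core idea in the abstract, reconstructing each test point from the training points under a sparsity penalty and reading k off the number of nonzero weights, can be illustrated with a much-simplified sketch. This is not the authors' CM-kNN algorithm: it keeps only the least-squares loss and the ℓ1-norm term, omits the graph Laplacian and ℓ2,1-norm regularizers, handles one test point at a time, and swaps in plain ISTA (proximal gradient descent) as the solver. The function names, the `lam` parameter, and the rule "neighbors are the training points with nonzero weight" are assumptions made for illustration.

```python
import numpy as np


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (element-wise shrinkage toward zero)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def sparse_reconstruction_weights(X_train, x_test, lam=0.1, n_iter=500):
    """Approximately minimize 0.5 * ||X_train.T @ w - x_test||^2 + lam * ||w||_1.

    Uses ISTA with a fixed step size 1/L (an illustrative solver choice).
    X_train has shape (n_samples, n_features); w has one entry per training point.
    """
    A = X_train.T                      # columns of A are training points
    L = np.linalg.norm(A, 2) ** 2      # Lipschitz constant of the smooth part
    w = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ w - x_test)  # gradient of the least-squares term
        w = soft_threshold(w - grad / L, lam / L)
    return w


def predict_with_learned_k(X_train, y_train, x_test, lam=0.1):
    """Classify x_test by majority vote over training points with nonzero weight.

    Returns (predicted_label, k), where k is the number of nonzero weights.
    Falls back to the single nearest neighbor if every weight is zero.
    """
    w = sparse_reconstruction_weights(X_train, x_test, lam)
    support = np.flatnonzero(w)
    if support.size == 0:
        nearest = np.argmin(np.linalg.norm(X_train - x_test, axis=1))
        support = np.array([nearest])
    labels, counts = np.unique(y_train[support], return_counts=True)
    return labels[np.argmax(counts)], support.size


if __name__ == "__main__":
    # Toy data (hypothetical): three orthonormal training points, two classes.
    X = np.eye(3)
    y = np.array([0, 0, 1])
    for x in (np.array([0.6, 0.5, 0.02]), np.array([0.05, 0.0, 1.0])):
        label, k = predict_with_learned_k(X, y, x, lam=0.1)
        print(f"test point {x}: k = {k}, predicted label = {label}")
```

On this toy data the two test points end up with different numbers of nonzero weights, which is the behavior the abstract describes as assigning different k values to different test data points. A larger `lam` shrinks more weights to zero and therefore tends to give smaller k.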

Funder

China Key Research Program

National Natural Science Foundation of China

Guangxi Higher Institutions' Program of Introducing 100 High-Level Overseas Talents

Guangxi Collaborative Innovation Center of Multi-Source Information Integration and Intelligent Processing

China 973 Program

Guangxi “Bagui” Teams for Innovation and Research

Guangxi Natural Science Foundation

Publisher

Association for Computing Machinery (ACM)

Subject

Artificial Intelligence, Theoretical Computer Science

Link

https://dl.acm.org/doi/pdf/10.1145/2990508

References: 55 articles.

1. Nearest Neighbor Classification of Remote Sensing Images With the Maximal Margin Principle

2. Jackknife Variance Estimation for Nearest-Neighbor Imputation

3. Nonconvex plus quadratic penalized low-rank and sparse decomposition for noisy image alignment

4. Debo Cheng, Shichao Zhang, Xingyi Liu, Ke Sun, and Ming Zong. 2015. Feature selection by combining subspace learning with sparse representation. Multimedia Syst. (2015), 1-7. DOI: 10.1007/s00530-015-0487-0

5. Iteratively reweighted least squares minimization for sparse recovery

Cited by 430 articles.

1. G-EEGCS: Graph-based optimum electroencephalogram channel selection;Biomedical Signal Processing and Control;2024-12

2. On the use of machine learning for predicting femtosecond laser grooves in tribological applications;Tribology International;2024-12

3. Accurate Loss Prediction of Realistic Hollow-Core Anti-Resonant Fibers Using Machine Learning;IEEE Journal of Selected Topics in Quantum Electronics;2024-11

4. Predictive modeling of flow characteristics in supersonic separators using machine learning;Fuel;2024-10

5. CDRM: Causal disentangled representation learning for missing data;Knowledge-Based Systems;2024-09