Improving the Accuracy of Nearest-Neighbor Classification Using Principled Construction and Stochastic Sampling of Training-Set Centroids

Published: 2021-01-26 Issue: 2 Volume: 23 Page: 149
ISSN: 1099-4300
Container-title: Entropy
language: en
Short-container-title: Entropy

Author:

Stephen Whitelam

Abstract

A conceptually simple way to classify images is to directly compare test-set data and training-set data. The accuracy of this approach is limited by the method of comparison used, and by the extent to which the training-set data cover configuration space. Here we show that this coverage can be substantially increased using coarse-graining (replacing groups of images by their centroids) and stochastic sampling (using distinct sets of centroids in combination). We use the MNIST and Fashion-MNIST data sets to show that a principled coarse-graining algorithm can convert training images into fewer image centroids without loss of accuracy of classification of test-set images by nearest-neighbor classification. Distinct batches of centroids can be used in combination as a means of stochastically sampling configuration space, and can classify test-set data more accurately than can the unaltered training set. On the MNIST and Fashion-MNIST data sets this approach converts nearest-neighbor classification from a mid-ranking- to an upper-ranking member of the set of classical machine-learning techniques.
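The two ideas named in the abstract, coarse-graining and stochastic sampling, can be illustrated with a minimal sketch. The abstract does not describe the paper's principled coarse-graining algorithm, so this sketch substitutes a simpler stand-in: random same-class groups are averaged into centroids, and several independently drawn batches of centroids are pooled for 1-nearest-neighbor classification. All function names and the parameters `group_size` and `n_batches` are hypothetical, and squared Euclidean distance is an assumed choice of comparison.

```python
import numpy as np


def make_centroids(X, y, group_size, rng):
    """Average random same-class groups of samples into centroids.

    Assumption: random grouping stands in for the paper's principled
    coarse-graining algorithm, which the abstract does not specify.
    """
    centroids, labels = [], []
    for c in np.unique(y):
        Xc = X[y == c]
        Xc = Xc[rng.permutation(len(Xc))]  # shuffle within the class
        for start in range(0, len(Xc), group_size):
            centroids.append(Xc[start:start + group_size].mean(axis=0))
            labels.append(c)
    return np.array(centroids), np.array(labels)


def nearest_neighbor_predict(X_ref, y_ref, X_test):
    """1-nearest-neighbor prediction using squared Euclidean distance."""
    diff = X_test[:, None, :] - X_ref[None, :, :]
    d2 = (diff ** 2).sum(axis=2)  # shape (n_test, n_ref)
    return y_ref[d2.argmin(axis=1)]


def sampled_centroid_predict(X, y, X_test, group_size=4, n_batches=5, seed=0):
    """Pool several independently drawn centroid batches, then run 1-NN.

    Assumption: batches are combined by taking the nearest centroid over
    their union; the paper may combine batches differently.
    """
    rng = np.random.default_rng(seed)
    batches = [make_centroids(X, y, group_size, rng) for _ in range(n_batches)]
    C = np.concatenate([b[0] for b in batches])
    L = np.concatenate([b[1] for b in batches])
    return nearest_neighbor_predict(C, L, X_test)
```

For image data such as MNIST, each row of `X` would be a flattened image and `y` the digit labels. The pairwise distance array grows as the number of test points times the number of centroids times the feature dimension, so a real run would need to process test points in chunks.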

Funder

U.S. Department of Energy

Publisher

MDPI AG

Subject

General Physics and Astronomy

Link

https://www.mdpi.com/1099-4300/23/2/149/pdf
