Affiliation:
1. KÜTAHYA DUMLUPINAR ÜNİVERSİTESİ
Abstract
Machine learning algorithms are widely used for product sorting in the food industry. Classification
relies on the attributes of the products, and these attributes differ from product to product. In this
study, the k-nearest neighbor (KNN) algorithm was used to classify the Kama, Rosa and Canadian
wheat varieties. The Seeds dataset from the UCI (University of California, Irvine) Machine Learning
Repository was used; it contains 70 samples of each wheat class. The wheat samples were selected
at random, and a soft X-ray technique was used to obtain high-quality images of the internal kernel
structure under experimental conditions. The effect of the distance metric and of the amount of
training data on classification success was also measured. The KNN algorithm was tested with
training ratios ranging from 50% to 90% of the data set and with neighborhood values of k = 1, 3
and 5. The Euclidean, Chebyshev, Manhattan and Mahalanobis distance metrics were evaluated for
each value of k. According to the results, the Mahalanobis metric with k = 3 achieved a
classification success of 0.9924 as measured by the AUC (Area Under the Curve) metric. No study in
the literature compares KNN neighborhood values and distance metrics together on food data sets
under varying training and test splits. The study is therefore expected to make an important
contribution to the literature.
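The experimental grid described in the abstract (training ratio, distance metric, neighborhood value) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code. It makes several assumptions: the data are synthetic 2-D Gaussian blobs standing in for the Seeds measurements, the training ratios step by 10%, the score is plain accuracy rather than AUC, and the Mahalanobis distance uses the covariance of the training points. All function names (`make_data`, `run_grid`, and so on) are hypothetical.

```python
import math
import random
from collections import Counter

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

def make_mahalanobis(points):
    """Build a Mahalanobis distance for 2-D points from their sample covariance."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy * sxy
    # Closed-form inverse of the 2x2 covariance matrix [[sxx, sxy], [sxy, syy]].
    ixx, iyy, ixy = syy / det, sxx / det, -sxy / det

    def mahalanobis(a, b):
        dx, dy = a[0] - b[0], a[1] - b[1]
        return math.sqrt(ixx * dx * dx + 2 * ixy * dx * dy + iyy * dy * dy)

    return mahalanobis

def knn_predict(train, query, k, dist):
    """Majority vote among the k training points closest to the query.

    Ties in the vote go to the label seen first, i.e. the nearer neighbor.
    """
    neighbors = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

def accuracy(train, test, k, dist):
    hits = sum(knn_predict(train, x, k, dist) == y for x, y in test)
    return hits / len(test)

def make_data(rng, per_class=70):
    # Synthetic stand-in: three 2-D Gaussian blobs, NOT the real Seeds measurements.
    centers = {"Kama": (0.0, 0.0), "Rosa": (3.0, 3.0), "Canadian": (0.0, 4.0)}
    data = [((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)), label)
            for label, (cx, cy) in centers.items() for _ in range(per_class)]
    rng.shuffle(data)
    return data

def run_grid(seed=0):
    """Score every (training ratio, metric, k) combination on the synthetic data."""
    rng = random.Random(seed)
    data = make_data(rng)
    results = {}
    for ratio in (0.5, 0.6, 0.7, 0.8, 0.9):  # assumed 10% steps
        cut = round(len(data) * ratio)
        train, test = data[:cut], data[cut:]
        metrics = {
            "euclidean": euclidean,
            "manhattan": manhattan,
            "chebyshev": chebyshev,
            "mahalanobis": make_mahalanobis([x for x, _ in train]),
        }
        for name, dist in metrics.items():
            for k in (1, 3, 5):
                results[(ratio, name, k)] = accuracy(train, test, k, dist)
    return results

if __name__ == "__main__":
    for key, acc in sorted(run_grid().items()):
        print(key, round(acc, 3))
```

The numbers this prints come from the synthetic blobs and say nothing about the figures reported in the study. Reproducing those would require the actual seven-attribute Seeds data, a general matrix inverse for the Mahalanobis metric, and an AUC computation.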
Publisher
Afyon Kocatepe Universitesi Fen Ve Muhendislik Bilimleri Dergisi