Author:
Muhammad Noor Mathivanan Norsyela, Azura Md. Ghani Nor, Mohd Janor Roziah
Abstract
Product classification is a key issue in e-commerce. Products are released to the market rapidly, and selecting the correct taxonomy category for each product has become a challenging task. A classification model is useful for classifying products precisely. This study proposes applying clustering prior to classification and uses a large-scale real-world data set to assess how effectively clustering improves the classification model. Conventional text classification procedures (preprocessing, feature extraction, and feature selection) are applied before the clustering technique. Results show that the clustering technique improves the accuracy of the classification model. Across all three approaches (classification only, classification with hierarchical clustering, and classification with K-means clustering), the best classification model is K-Nearest Neighbor (KNN). Although the accuracy of the KNN models is the same across the approaches, the KNN model with K-means clustering had the shortest execution time. Hence, applying K-means clustering prior to the KNN model helps reduce computation time.
Publisher
Institute of Advanced Engineering and Science
Subject
Electrical and Electronic Engineering, Control and Optimization, Computer Networks and Communications, Hardware and Architecture, Instrumentation, Information Systems, Control and Systems Engineering, Computer Science (miscellaneous)
Cited by
6 articles.