Abstract
Supervised machine learning techniques are widely used in areas such as finance, education, healthcare, and engineering because of their ability to learn from past data. However, such techniques can be very slow when the dataset is high-dimensional, and irrelevant features may reduce classification success. Feature selection or feature reduction techniques are therefore commonly used to overcome these issues. Information security for both people and networks is crucial, and it must be ensured without wasting time. Hence, feature selection approaches that make the algorithms faster without reducing classification success are needed. In this study, we compare both the classification success and the run-time performance of state-of-the-art classification algorithms using standard deviation-based feature selection on security datasets. For this purpose, we applied standard deviation-based feature selection to the KDD Cup 99 and Phishing Legitimate datasets to select the most relevant features, and then ran the selected classification algorithms on the reduced datasets to compare the results. According to the obtained results, the classification success of all algorithms was satisfactory, and Decision Tree (DT) performed best. In terms of run time, Decision Tree, k Nearest Neighbors, and Naïve Bayes (NB) were sufficiently fast, whereas Support Vector Machine (SVM) and Artificial Neural Networks (ANN or NN) were too slow.
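The abstract does not spell out the selection rule, so the following is only a minimal sketch of one common reading of standard deviation-based feature selection: rank the features by their standard deviation and keep the top k. The function names `select_by_std` and `project`, the choice of k, and the toy data are all hypothetical and are not taken from the paper.

```python
from statistics import pstdev


def select_by_std(rows, k):
    """Return the (sorted) indices of the k columns with the largest
    population standard deviation. Assumed selection rule, not the paper's."""
    columns = list(zip(*rows))  # transpose: one tuple per feature
    stds = [pstdev(col) for col in columns]
    # Rank column indices by descending standard deviation.
    ranked = sorted(range(len(stds)), key=lambda j: -stds[j])
    return sorted(ranked[:k])


def project(rows, keep):
    """Keep only the selected columns of every row."""
    return [[row[j] for j in keep] for row in rows]


# Toy data, made up for illustration: column 1 is constant and
# column 2 barely varies, so both are dropped when k=2.
X = [
    [0.0, 1.0, 0.50, 10.0],
    [1.0, 1.0, 0.51, 20.0],
    [0.0, 1.0, 0.49, 30.0],
    [1.0, 1.0, 0.50, 40.0],
]

keep = select_by_std(X, k=2)
X_reduced = project(X, keep)
print(keep)       # indices of the retained features
print(X_reduced)  # dataset restricted to those features
```

Standard deviation depends on the scale of each feature, so in practice the features would usually be brought to a comparable range before ranking. Whether and how the study normalizes the data is not stated in the abstract.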
Publisher
International Journal of Pure and Applied Sciences
Subject
Organic Chemistry, Biochemistry
Cited by
2 articles.