Author:
Protić Danijela,Stanković Miomir
Abstract
Anomaly-based intrusion detection system detects intrusion to the computer network based on a reference model that has to be able to identify its normal behavior and flag what is not normal. In this process network traffic is classified into two groups by adding different labels to normal and malicious behavior. Main disadvantage of anomaly-based intrusion detection system is necessity to learn the difference between normal and not normal. Another disadvantage is the complexity of datasets which simulate realistic network traffic. Feature selection and normalization can be used to reduce data complexity and decrease processing runtime by selecting a better feature space This paper presents the results of testing the influence of feature selection and instances normalization to the classification performances of k-nearest neighbor, weighted k-nearest neighbor, support vector machines and decision tree models on 10 days records of the Kyoto 2006+ dataset. The data was pre-processed to remove all categorical features from the dataset. The resulting subset contained 17 features. Features containing instances which could not be normalized into the range [-1, 1] have also been removed. The resulting subset consisted of nine features. The feature ‘Label’ categorized network traffic to two classes: normal (1) and malicious (0). The performance metric to evaluate models was accuracy. Proposed method resulted in very high accuracy values with Decision Tree giving highest values for not-normalized and with k-nearest neighbor giving highest values for normalized data.Keywords: feature selection, normalization, k-NN, weighted k-NN, SVM, decision tree, Kyoto 2006+
Cited by
3 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献