Evaluating Feature Selection Methods for Network Intrusion Detection with Kyoto Data

Published: 2016-02 Issue: 01 Volume: 23 Page: 1650001
ISSN: 0218-5393
Container-title: International Journal of Reliability, Quality and Safety Engineering
Language: en
Short-container-title: Int. J. Rel. Qual. Saf. Eng.

Author:

Najafabadi Maryam M.¹, Khoshgoftaar Taghi M.¹, Seliya Naeem¹

Affiliation:

1. Florida Atlantic University, Boca Raton, Florida, USA

Abstract

Given the large volume of data flowing through network routers, there is high demand for detecting malicious and unhealthy network traffic so that network users get reliable operation and their information stays secure. Predictive models are needed to identify whether a network traffic record is healthy or malicious, and machine learning methods are increasingly used to build such models for network intrusion detection. These models must monitor and analyze a large amount of network data in a reasonable amount of time, usually in real time. Because they cannot always process all of the data, data reduction methods are needed to cut the amount of data that has to be processed. Feature selection is one such method and can decrease processing time. It is important to understand which features are most relevant to determining whether a network traffic record is malicious, and to avoid using the whole feature set, so that processing is more efficient. It is also important that the simpler model built from the reduced feature set be as effective as a model that uses all the features. Feature selection is therefore a very important pre-processing step in the detection of network attacks. The goal is to remove irrelevant and redundant features in order to increase the overall effectiveness of an intrusion detection system without negatively affecting classification performance. Most previous feature selection studies in intrusion detection have used the KDD 99 dataset. Because KDD 99 is outdated, in this paper we compare different feature selection methods on a relatively new dataset, Kyoto 2006+, for which there is no comprehensive comparison of feature selection approaches.
In the present work, we study four filter-based feature selection methods, chosen from two categories, for network intrusion detection. Three filter-based feature rankers and one filter-based subset evaluation technique are compared with one another and with the null case, which applies no feature selection. We also apply statistical analysis to determine whether the performance differences between these feature selection methods are significant. We find that, among all the feature selection methods, Signal-to-Noise (S2N) gives the best performance. It also outperforms the no-feature-selection approach in all the experiments.
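
The abstract names Signal-to-Noise (S2N) as a filter-based feature ranker but does not define it. As a rough illustration only, the sketch below assumes the commonly used definition, in which a feature's score is the absolute difference of its class means divided by the sum of its class standard deviations; the paper's exact formulation may differ. The function names `s2n_scores` and `top_k_features` and the toy data are hypothetical and are not taken from the paper or from Kyoto 2006+.

```python
import numpy as np

def s2n_scores(X, y, eps=1e-12):
    """Score each feature with an assumed S2N definition:
    |mean_pos - mean_neg| / (std_pos + std_neg), for binary labels in {0, 1}."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    pos, neg = X[y == 1], X[y == 0]
    mean_diff = np.abs(pos.mean(axis=0) - neg.mean(axis=0))
    spread = pos.std(axis=0) + neg.std(axis=0)
    # eps avoids division by zero for features that are constant within both classes.
    return mean_diff / (spread + eps)

def top_k_features(X, y, k):
    """Return the column indices of the k highest-scoring features."""
    scores = s2n_scores(X, y)
    return np.argsort(scores)[::-1][:k]

# Toy data (hypothetical): column 0 separates the classes, columns 1 and 2 do not.
X_demo = [[0.0, 5.0, 1.0],
          [0.2, 1.0, 1.0],
          [1.0, 4.0, 1.0],
          [1.2, 2.0, 1.0]]
y_demo = [0, 0, 1, 1]
print(top_k_features(X_demo, y_demo, k=1))
```

A ranker of this kind scores each feature independently, which keeps it cheap on large traffic logs; a subset evaluation technique, by contrast, scores groups of features together.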

Publisher

World Scientific Pub Co Pte Lt

Subject

Electrical and Electronic Engineering; Industrial and Manufacturing Engineering; Energy Engineering and Power Technology; Aerospace Engineering; Safety, Risk, Reliability and Quality; Nuclear Energy and Engineering; General Computer Science

Link

https://www.worldscientific.com/doi/pdf/10.1142/S0218539316500017

References: 11 articles.

1. Practical real-time intrusion detection using machine learning approaches

2. An Introduction to Statistical Learning

3. Characterization and classification of malicious Web traffic

4. A hierarchical SOM-based intrusion detection system

Cited by 17 articles.

1. Numerical Feature Selection and Hyperbolic Tangent Feature Scaling in Machine Learning-Based Detection of Anomalies in the Computer Network Behavior;Electronics;2023-10-07

2. Cybersecurity attacks: Which dataset should be used to evaluate an intrusion detection system?;Vojnotehnicki glasnik;2023

3. Machine learning to combat cyberattack: a survey of datasets and challenges;The Journal of Defense Modeling and Simulation: Applications, Methodology, Technology;2022-05-01

4. A Hybrid PSO-BPSO Based Kernel Extreme Learning Machine Model for Intrusion Detection;J INF PROCESS SYST;2022

5. IoT information theft prediction using ensemble feature selection;Journal of Big Data;2022-01-06