Enhancing Machine Learning Prediction in Cybersecurity Using Dynamic Feature Selector

Published:2021-03-21 Issue:1 Volume:1 Page:199-218
ISSN:2624-800X
Container-title:Journal of Cybersecurity and Privacy
language:en
Short-container-title:JCP

Author:

Ahsan Mostofa, Gomes Rahul, Chowdhury Md. Minhaz, Nygard Kendall E.

Abstract

Machine learning algorithms are becoming very efficient in intrusion detection systems thanks to their real-time response and adaptive learning process. A robust machine learning model can be deployed for anomaly detection by using a comprehensive dataset with multiple attack types. Datasets today contain many attributes, and this high dimensionality poses a significant challenge to information extraction in terms of time and space complexity. Moreover, a large number of attributes may hinder the creation of a decision boundary because of noise in the dataset. Large-scale data with redundant or insignificant features increases computational time and often decreases goodness of fit, which is a critical issue in cybersecurity. In this research, we propose and implement an efficient feature selection algorithm to filter out insignificant variables. Our proposed Dynamic Feature Selector (DFS) uses statistical analysis and feature importance tests to reduce model complexity and improve prediction accuracy. To evaluate DFS, we conducted experiments on two datasets used for cybersecurity research, namely Network Security Laboratory (NSL-KDD) and University of New South Wales (UNSW-NB15). In the meta-learning stage, four algorithms were compared for accuracy estimation: Bidirectional Long Short-Term Memory (Bi-LSTM), Gated Recurrent Units, Random Forest, and a proposed Convolutional Neural Network and Long Short-Term Memory (CNN-LSTM). For NSL-KDD, experiments revealed an increase in accuracy from 99.54% to 99.64% while reducing the number of one-hot encoded features from 123 to 50. For UNSW-NB15, we observed an increase in accuracy from 90.98% to 92.46% while reducing the number of features from 196 to 47. The proposed approach is thus able to achieve higher accuracy while significantly lowering the number of features required for processing.
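
The abstract does not describe the internals of DFS, so the following is only a minimal sketch of the general idea of filtering features by a statistical score, not the authors' algorithm. It ranks features by absolute Pearson correlation with the label and keeps the top few. The function names (`pearson`, `select_features`), the column names, and the synthetic data are all assumptions made for illustration; none of them come from the paper or from NSL-KDD or UNSW-NB15.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_features(columns, labels, keep):
    """Rank columns by |correlation with labels| and return the top `keep` names."""
    scores = {name: abs(pearson(values, labels)) for name, values in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:keep], scores


# Synthetic toy data (hypothetical; not drawn from either benchmark dataset).
rng = random.Random(0)
labels = [rng.randint(0, 1) for _ in range(200)]
columns = {
    "signal": [y + rng.gauss(0, 0.3) for y in labels],  # informative feature
    "noise": [rng.gauss(0, 1) for _ in labels],         # unrelated to the label
    "constant": [1.0 for _ in labels],                  # carries no information
}

selected, scores = select_features(columns, labels, keep=1)
print(selected)
```

A correlation filter like this only captures linear, per-feature relationships; the "feature importance tests" mentioned in the abstract would presumably rely on a trained model instead, which this sketch does not attempt.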

Publisher

MDPI AG

Subject

General Medicine

Link

https://www.mdpi.com/2624-800X/1/1/11/pdf

References: 69 articles.

1. Selection of relevant features and examples in machine learning

2. Bagging predictors

Cited by 39 articles.

1. Effect of feature optimization on performance of machine learning models for predicting traffic incident duration;Engineering Applications of Artificial Intelligence;2024-05

2. Predicting voided computerized physician order entry in oral and maxillofacial surgery inpatients: development and validation of machine learning model;2024-01-18

3. Systematic Literature Review on Forecasting and Prediction of Technical Debt Evolution;2024

4. A New Approach to Data Analysis Using Machine Learning for Cybersecurity;Big Data and Cognitive Computing;2023-11-21

5. Deep Learning for Breast Cancer Prediction in the Era of Big Data: A Comparative Study of Gene Expression and DNA Methylation;2023 International Conference on Sustainable Communication Networks and Application (ICSCNA);2023-11-15