Affiliation:
1. Department of Mathematics and Physics, Catholic University of the Sacred Heart, 25121 Brescia, Italy
Abstract
The increasing sophistication of cyberattacks calls for detection systems that can accurately identify and mitigate potential threats. This research addresses cyberattack detection through an approach that begins with generating a realistic yet imbalanced dataset simulating various types of cyberattacks. Recognizing the limitations posed by imbalanced data, we explored multiple data augmentation techniques to improve the model’s learning effectiveness and ensure robust performance across different attack scenarios. First, we constructed a detailed dataset reflecting real-world network intrusion conditions by simulating a range of cyberattack types, so that it embodies the class imbalances typically observed in genuine cybersecurity threats. We then applied several data augmentation techniques, including SMOTE and ADASYN, to correct the skew in class distribution and provide a more balanced dataset for training supervised machine learning models. Our evaluation of these techniques across various models, such as Random Forests and Neural Networks, demonstrates significant improvements in detection capabilities. The analysis also investigates feature importance, showing which attributes most strongly influence the models’ predictions. This enhances the interpretability of the models and helps refine feature engineering and selection to optimize performance.
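The oversampling step named in the abstract can be illustrated with a minimal sketch of SMOTE-style interpolation. This is not the authors' code and not the implementation from any particular library: the function name `smote_oversample`, the toy feature vectors, and the parameter values are all assumptions made for illustration. The idea is that each synthetic minority sample is placed at a random point on the line segment between a real minority sample and one of its nearest minority neighbours. ADASYN follows a similar interpolation scheme but, roughly speaking, generates more synthetic points around minority samples that have many majority-class neighbours; that weighting is not shown here.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority
    samples and their k nearest minority neighbours (SMOTE-style sketch)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to base.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy, made-up feature vectors (hypothetical scaled traffic features).
benign = [[0.1 * i, 0.05 * i] for i in range(20)]
attack = [[5.0, 5.0], [5.5, 4.8], [4.9, 5.3], [5.2, 5.1]]

# Generate enough synthetic attack samples to match the benign class size.
new_attack = smote_oversample(attack, n_new=len(benign) - len(attack), k=3)
balanced_attack = attack + new_attack
print(len(benign), len(balanced_attack))
```

In practice, oversampling of this kind is normally applied to the training split only, so that synthetic points do not leak into the data used for evaluation.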