Optimizing IoT Intrusion Detection Using Balanced Class Distribution, Feature Selection, and Ensemble Machine Learning Techniques
Author:
Muhammad Bisri Musthafa 1, Samsul Huda 2, Yuta Kodera 1, Md. Arshad Ali 3, Shunsuke Araki 4, Jedidah Mwaura 4, Yasuyuki Nogami 1
Affiliation:
1. Graduate School of Environmental, Life, Natural Science and Technology, Okayama University, Okayama 700-8530, Japan
2. Green Innovation Center, Okayama University, Okayama 700-8530, Japan
3. Faculty of CSE, Hajee Mohammad Danesh Science and Technology University, Dinajpur 5200, Bangladesh
4. Graduate School of Computer Science and Systems Engineering, Kyushu Institute of Technology, Fukuoka 804-8550, Japan
Abstract
Internet of Things (IoT) devices are driving innovation, efficiency, and sustainability across various industries. However, as the number of connected IoT devices grows, intrusion becomes a major concern in IoT security. Preventing such attacks requires intrusion detection systems (IDSs), a critical component of cybersecurity infrastructure designed to detect and respond to malicious activities within a network or system. Traditional IDSs rely on predefined signatures or rules to identify known threats, but these techniques may struggle to detect novel or sophisticated attacks. Implementing IDSs with machine learning (ML) and deep learning (DL) techniques has been proposed to improve their ability to detect attacks and thereby strengthen overall cybersecurity posture and resilience. However, ML and DL techniques face several issues that can degrade a model's performance and effectiveness, such as overfitting and the influence of unimportant features on finding meaningful patterns. To remain performant and reliable against new and unseen threats, ML models in IDSs must be optimized by addressing overfitting and applying feature selection. In this paper, we propose a scheme to optimize IoT intrusion detection using class balancing and feature selection for preprocessing. We evaluated the scheme on the UNSW-NB15 and NSL-KDD datasets with two ensemble models: a support vector machine (SVM) with bagging and long short-term memory (LSTM) with stacking. The performance results and confusion matrices show that LSTM stacking with analysis of variance (ANOVA) feature selection is the superior model for classifying network attacks. It achieves remarkable accuracies of 96.92% and 99.77% with overfitting values of 0.33% and 0.04% on the two datasets, respectively. The model's ROC curve also bends sharply, with AUC values of 0.9665 and 0.9971 for the UNSW-NB15 and NSL-KDD datasets, respectively.
References: 45 articles.