Detecting web attacks using random undersampling and ensemble learners-Reference-Cited by-同舟云学术

Detecting web attacks using random undersampling and ensemble learners

Published:2021-05-27 Issue:1 Volume:8 Page:
ISSN:2196-1115
Container-title:Journal of Big Data
language:en
Short-container-title:J Big Data

Author:

Zuech Richard^ORCID,Hancock John,Khoshgoftaar Taghi M.

Abstract

AbstractClass imbalance is an important consideration for cybersecurity and machine learning. We explore classification performance in detecting web attacks in the recent CSE-CIC-IDS2018 dataset. This study considers a total of eight random undersampling (RUS) ratios: no sampling, 999:1, 99:1, 95:5, 9:1, 3:1, 65:35, and 1:1. Additionally, seven different classifiers are employed: Decision Tree (DT), Random Forest (RF), CatBoost (CB), LightGBM (LGB), XGBoost (XGB), Naive Bayes (NB), and Logistic Regression (LR). For classification performance metrics, Area Under the Receiver Operating Characteristic Curve (AUC) and Area Under the Precision-Recall Curve (AUPRC) are both utilized to answer the following three research questions. The first question asks: “Are various random undersampling ratios statistically different from each other in detecting web attacks?” The second question asks: “Are different classifiers statistically different from each other in detecting web attacks?” And, our third question asks: “Is the interaction between different classifiers and random undersampling ratios significant for detecting web attacks?” Based on our experiments, the answers to all three research questions is “Yes”. To the best of our knowledge, we are the first to apply random undersampling techniques to web attacks from the CSE-CIC-IDS2018 dataset while exploring various sampling ratios.

Publisher

Springer Science and Business Media LLC

Subject

Information Systems and Management,Computer Networks and Communications,Hardware and Architecture,Information Systems

Link

https://link.springer.com/content/pdf/10.1186/s40537-021-00460-8.pdf

Reference50 articles.

1. Young J. US Ecommerce Sales Grow 14.9% in 2019. Accessed: 2020-11-28. https://www.digitalcommerce360.com/article/us-ecommerce-sales/

2. Radanliev P, De Roure D, Walton R, Van Kleek M, Montalvo RM, Santos O, Burnap P, Anthi E, et al. Artificial intelligence and machine learning in dynamic cyber risk analytics at the edge. SN Applied Sciences. 2020;2(11):1–8.

3. Leevy JL, Hancock J, Zuech R, Khoshgoftaar TM. Detecting cybersecurity attacks across different network features and learners. Journal of Big Data. 2021;8(1):1–29.

4. Wald R, Villanustre F, Khoshgoftaar TM, Zuech R, Robinson J, Muharemagic E. Using feature selection and classification to build effective and efficient firewalls. In: Proceedings of the 2014 IEEE 15th International Conference on Information Reuse and Integration (IEEE IRI 2014), 2014;pp. 850–854 . IEEE

5. Najafabadi MM, Khoshgoftaar TM, Seliya N. Evaluating feature selection methods for network intrusion detection with kyoto data. International Journal of Reliability, Quality and Safety Engineering. 2016;23(01):1650001.

Cited by 43 articles. 订阅此论文施引文献订阅此论文施引文献，注册后可以免费订阅5篇论文的施引文献，订阅后可以查看论文全部施引文献

1. Cost-sensitive stacked long short-term memory with an evolutionary framework for minority class detection;Applied Soft Computing;2024-11

2. Securing the Internet of Health Things: Embedded Federated Learning-Driven Long Short-Term Memory for Cyberattack Detection;Electronics;2024-08-31

3. Ex-ante expected changes in ESG and future stock returns based on machine learning;The British Accounting Review;2024-08

4. Supply chain fraud prediction with machine learning and artificial intelligence;International Journal of Production Research;2024-06-07

5. A review of ensemble learning and data augmentation models for class imbalanced problems: Combination, implementation and evaluation;Expert Systems with Applications;2024-06