Abstract
The exponential growth in computer networks and network applications worldwide has been matched by a surge in cyberattacks. For this reason, datasets such as CSE-CIC-IDS2018 were created to train predictive models on network-based intrusion detection. These datasets are not meant to serve as repositories for signature-based detection systems, but rather to promote research on anomaly-based detection through various machine learning approaches. CSE-CIC-IDS2018 contains about 16,000,000 instances collected over the course of ten days. It is the most recent intrusion detection dataset that is big data, publicly available, and covers a wide range of attack types. This multi-class dataset has a class imbalance, with roughly 17% of the instances comprising attack (anomalous) traffic. Our survey work contributes several key findings. We determined that the best performance scores for each study, where available, were unexpectedly high overall, which may be due to overfitting. We also found that most of the works did not address class imbalance, the effects of which can bias results in a big data study. Lastly, we discovered that information on the data cleaning of CSE-CIC-IDS2018 was inadequate across the board, a finding that may indicate problems with reproducibility of experiments. In our survey, major research gaps have also been identified.
Publisher
Springer Science and Business Media LLC
Subject
Information Systems and Management, Computer Networks and Communications, Hardware and Architecture, Information Systems