Towards Supercomputing Categorizing the Maliciousness upon Cybersecurity Blacklists with Concept Drift

Published: 2023-05-20 Volume: 2023 Page: 1-8
ISSN: 2577-7408
Container-title: Computational and Mathematical Methods
Language: en
Short-container-title: Computational and Mathematical Methods

Author:

Carriegos M. V.¹, DeCastro-García N.¹, Escudero D.²

Affiliation:

1. Departamento de Matemáticas, Universidad de León, León, Spain

2. RIASC, Instituto de Ciberseguridad, Universidad de León, León, Spain

Abstract

In this article, we have carried out a case study to optimize the classification of the maliciousness of cybersecurity events by IP address using machine learning techniques. The optimization focuses on time complexity. First, we have used the extreme gradient boosting model; second, we have parallelized the machine learning algorithm to study the effect of using different numbers of cores. We have classified the maliciousness of cybersecurity events in a biclass and a multiclass scenario. All the experiments have been carried out with a well-known optimal set of features: the geolocation information of the IP address. However, the geolocation features of an IP address can change over time. Also, the relation between an IP address and its maliciousness label can change if the address is tested several times. As a result, the models’ performance could degrade because the information acquired from training on past samples may not generalize well to new samples. This situation is known as concept drift. For this reason, it is necessary to study whether the proposed optimization works in a concept drift scenario. The results show that concept drift does not degrade the models. Also, boosting algorithms achieve competitive or better performance compared to similar research works in the biclass scenario, and an effective categorization in the multiclass case. Regarding high-performance computing resources, the most efficient setting is reached using five nodes.
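
As a rough illustration of the concept drift check described in the abstract, the sketch below trains a model on past samples and compares its accuracy on a later window that follows the same pattern against one where the relation between feature and label has changed. It is a toy under stated assumptions, not the authors' method: the classifier is a simple per-feature majority vote rather than extreme gradient boosting, the two-letter codes "AA" and "BB" are hypothetical stand-ins for geolocation features, and all sample data is invented for the example.

```python
from collections import Counter, defaultdict


def train_majority(samples):
    """Learn the most frequent label per feature value.

    Toy stand-in for a real classifier (assumption: NOT gradient boosting).
    Returns (lookup table, overall majority label used as fallback).
    """
    per_feature = defaultdict(Counter)
    for feature, label in samples:
        per_feature[feature][label] += 1
    fallback = Counter(label for _, label in samples).most_common(1)[0][0]
    table = {f: c.most_common(1)[0][0] for f, c in per_feature.items()}
    return table, fallback


def accuracy(model, samples):
    """Fraction of samples whose predicted label matches the true label."""
    table, fallback = model
    hits = sum(1 for f, label in samples if table.get(f, fallback) == label)
    return hits / len(samples)


def evaluate_over_time(past, later_windows):
    """Train once on past samples, then score each later window in order."""
    model = train_majority(past)
    return [accuracy(model, window) for window in later_windows]


# Hypothetical data: (geolocation code, maliciousness label).
past = (
    [("AA", "malicious")] * 8 + [("AA", "benign")] * 2
    + [("BB", "benign")] * 9 + [("BB", "malicious")] * 1
)
# Later window that follows the same pattern as the training period.
stable_window = (
    [("AA", "malicious")] * 4 + [("AA", "benign")] * 1
    + [("BB", "benign")] * 5
)
# Later window where addresses from "AA" are now mostly benign (drift).
drifted_window = (
    [("AA", "benign")] * 4 + [("AA", "malicious")] * 1
    + [("BB", "benign")] * 5
)

accuracies = evaluate_over_time(past, [stable_window, drifted_window])
print(accuracies)
```

A clear accuracy drop between the stable and the drifted window is the kind of degradation the abstract refers to; the reported finding is that, for the models studied, concept drift did not degrade performance.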

Funder

Spanish National Cybersecurity Institute

Publisher

Hindawi Limited

Subject

Computational Mathematics, Computational Theory and Mathematics, Computational Mechanics

Link

http://downloads.hindawi.com/journals/cmm/2023/5780357.pdf

Reference: 27 articles.

1. A mathematical analysis about the geo-temporal characterization of the multi-class maliciousness of an IP address

2. Characterizing concept drift

3. A survey on concept drift adaptation

4. Learning under Concept Drift: A Review

5. An effectiveness analysis of transfer learning for the concept drift problem in malware detection

Cited by 1 article.

1. Transfer and online learning for IP maliciousness prediction in a concept drift scenario; Wireless Networks; 2024-01-28