
Using Ensemble Learning for Anomaly Detection in Cyber–Physical Systems

Published: 2024-04-07 Issue: 7 Volume: 13 Page: 1391
ISSN: 2079-9292
Container-title: Electronics
Language: en
Short-container-title: Electronics

Author:

Jeffrey Nicholas¹, Tan Qing², Villar José R.¹

Affiliation:

1. Faculty of Computer Science, University of Oviedo, 33003 Oviedo, Spain

2. Faculty of Science and Technology, Athabasca University, Athabasca, AB T9S 3A3, Canada

Abstract

The swift embrace of Industry 4.0 paradigms has led to the growing convergence of Information Technology (IT) networks and Operational Technology (OT) networks. Traditionally isolated on air-gapped and fully trusted networks, OT networks are now becoming more interconnected with IT networks due to the advancement and applications of IoT. This expanded attack surface has exposed vulnerabilities in Cyber–Physical Systems (CPSs), resulting in increasingly frequent compromises with substantial economic and life safety repercussions. Existing methods for detecting security threats as anomalies typically use simple threshold-based strategies or apply Machine Learning (ML) algorithms to historical data to predict future anomalies. However, existing ML-based anomaly detection techniques suffer from poor predictive performance, because the high heterogeneity across CPS environments limits the opportunities for transfer learning and real-world training data are scarce. This paper introduces a hybrid anomaly detection approach designed to identify threats to CPSs by combining three techniques: the signature-based anomaly detection typically used in IT networks, the threshold-based anomaly detection typically used in OT networks, and behavioural-based anomaly detection using Ensemble Learning (EL), which applies multiple ML algorithms to the same dataset and combines their strengths to increase accuracy. Multiple public research datasets were used to validate the proposed approach. The hybrid methodology employs a divide-and-conquer strategy: the detection of certain cyber threats is offloaded to computationally inexpensive signature-based and threshold-based methods that use domain knowledge, which minimizes the amount of behavioural data needed for ML model training and thus achieves a higher accuracy over a reduced timeframe.
The experimental results showed accuracy improvements of 4–7% over those of the conventional ML classifiers in performing anomaly detection across multiple datasets, which is particularly important to the operators of CPS environments due to the high financial and life safety costs associated with interruptions to system availability.
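
The divide-and-conquer flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not list the signatures, thresholds, features, or ML algorithms used, so every name and value below (`KNOWN_BAD_PAYLOADS`, `SENSOR_LIMITS`, the three rule-like "learners", the field names) is a hypothetical placeholder. The sketch only shows the ordering of the stages, with cheap signature and threshold checks first and a majority-vote ensemble consulted only for records those checks do not flag.

```python
from collections import Counter
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]

# Hypothetical domain knowledge; the paper's real rules are not given in the abstract.
KNOWN_BAD_PAYLOADS = {"write_coil_flood", "unauth_firmware_push"}
SENSOR_LIMITS = {"tank_level": (0.0, 100.0), "pressure": (0.0, 8.0)}


def signature_stage(record: Record) -> bool:
    """IT-style check: flag records whose payload matches a known-bad signature."""
    return record.get("payload") in KNOWN_BAD_PAYLOADS


def threshold_stage(record: Record) -> bool:
    """OT-style check: flag any present sensor reading outside its allowed range."""
    for name, (low, high) in SENSOR_LIMITS.items():
        value = record.get(name)
        if value is not None and not (low <= value <= high):
            return True
    return False


# Stand-ins for trained base classifiers. A real system would fit ML models
# on behavioural data; these fixed rules only make the sketch runnable.
def rate_learner(record: Record) -> bool:
    return record.get("packets_per_s", 0) > 500


def drift_learner(record: Record) -> bool:
    return abs(record.get("pressure", 0.0) - record.get("setpoint", 0.0)) > 2.0


def timing_learner(record: Record) -> bool:
    return record.get("inter_arrival_ms", 100) < 5


BASE_LEARNERS: List[Callable[[Record], bool]] = [
    rate_learner,
    drift_learner,
    timing_learner,
]


def ensemble_stage(record: Record) -> bool:
    """Behavioural check: majority vote across the base learners."""
    votes = Counter(learner(record) for learner in BASE_LEARNERS)
    return votes[True] > votes[False]


def classify(record: Record) -> str:
    """Run the cheap stages first; reach the ensemble only if they pass."""
    if signature_stage(record):
        return "anomaly:signature"
    if threshold_stage(record):
        return "anomaly:threshold"
    if ensemble_stage(record):
        return "anomaly:behavioural"
    return "normal"
```

Because records caught by the first two stages never reach the ensemble, the behavioural models would only need to be trained on the remaining traffic, which is the data reduction the abstract attributes to the hybrid design.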

Funder

Spanish Ministry of Economics and Industry

Spanish Research Agency

Missions Science and Innovation

Principado de Asturias

Council of Gijón through the University Institute of Industrial Technology of Asturias

Fundación Universidad de Oviedo

Publisher

MDPI AG

Link

https://www.mdpi.com/2079-9292/13/7/1391/pdf

References: 32 articles.