Affiliation:
1. George Mason University, Fairfax, VA
2. Army Research Laboratory, Adelphi, MD
Abstract
The analysts at a cybersecurity operations center (CSOC) analyze the alerts that are generated by intrusion detection systems (IDSs). Under normal operating conditions, sufficient numbers of analysts are available to analyze the alert workload. For the purpose of this article, this means that the cybersecurity analysts in each shift can fully investigate each and every alert that is generated by the IDSs in a reasonable amount of time and perform their normal tasks in a shift. Normal tasks include analysis time, time to attend training programs, report writing time, personal break time, and time to update the signatures on new patterns in alerts as detected by the IDS. There are several disruptive factors that occur randomly and can adversely impact the normal operating condition of a CSOC, such as (1) higher alert generation rates from a few IDSs, (2) new alert patterns that decrease the throughput of the alert analysis process, and (3) analyst absenteeism. The impact of the preceding factors is that the alerts wait for a long duration before being analyzed, which impacts the level of operational effectiveness (LOE) of the CSOC. To return the CSOC to normal operating conditions, the manager of a CSOC can take several actions, such as increasing the alert analysis time spent by analysts in a shift by canceling a training program, spending some of his own time to assist the analysts in alert investigation, and calling upon the on-call analyst workforce to boost the service rate of alerts. However, additional resources are limited in quantity over a 14-day work cycle, and the CSOC manager must determine when and how much action to take in the face of uncertainty, which arises from both the intensity and the random occurrences of the disruptive factors. The preceding decision by the CSOC manager is nontrivial and is often made in an ad hoc manner using prior experiences. This work develops a reinforcement learning (RL) model for optimizing the LOE throughout the entire 14-day work cycle of a CSOC in the face of uncertainties due to disruptive events. Results indicate that the RL model is able to assist the CSOC manager with a decision support tool to make better decisions than current practices in determining when and how much resource to allocate when the LOE of a CSOC deviates from the normal operating condition.
Publisher
Association for Computing Machinery (ACM)
Subject
Artificial Intelligence,Theoretical Computer Science
Cited by
12 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献