Affiliation:
1. Faculty of Computer Science, Dalhousie University, Halifax, Nova Scotia, Canada
2. 2Keys Corporation, An Interac Company, Ottawa, Ontario, Canada
Abstract
In cyber security, log files are a rich source of information about the state of a computer service or system. Automating the summarization of log file content is an important aid for decision‐making, especially given the 24/7 nature of network and service operations. We benchmark eight distinct log files in order to assess the impact of the following: (1) different embedding methods for developing semantic descriptions of the original log files, (2) dimension reduction applied to the high‐dimensional semantic space, and (3) different unsupervised learning algorithms for providing a visual summary of the service state. Benchmarking demonstrates that (1) word‐to‐vector embeddings identified by bidirectional encoder representations from transformers (BERT) without “fine‐tuning” are sufficient to match the performance of Bag‐of‐Words embeddings provided by term frequency‐inverse document frequency (TF‐IDF) and (2) the self‐organizing map without dimension reduction provides the most effective anomaly detector.
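The pipeline the abstract describes (embed log lines, then summarize them with an unsupervised learner) can be illustrated with a minimal sketch. This is not the authors' implementation: the log lines are invented, the function names `tfidf_vectors`, `train_som`, and `quantization_errors` are hypothetical, the map size and training schedule are arbitrary, and using the distance to the best-matching unit as an anomaly score is an assumption made here for illustration. Only the Python standard library is used, and dimension reduction is skipped.

```python
import math
import random
from collections import Counter


def tfidf_vectors(lines):
    """Bag-of-Words TF-IDF vectors (L2-normalised), one per log line."""
    docs = [line.lower().split() for line in lines]
    vocab = sorted({tok for doc in docs for tok in doc})
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed inverse document frequency (one common variant).
    idf = {tok: math.log((1 + n) / (1 + df[tok])) + 1.0 for tok in vocab}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vec = [tf[tok] * idf[tok] for tok in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vocab, vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_som(data, rows=2, cols=2, epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a small rectangular self-organizing map; returns unit weights."""
    rng = random.Random(seed)
    dim = len(data[0])
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    weights = [[rng.random() for _ in range(dim)] for _ in grid]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.1)  # shrinking neighbourhood radius
        for x in rng.sample(data, len(data)):    # shuffled pass over the data
            bmu = min(range(len(grid)), key=lambda i: sq_dist(weights[i], x))
            for i, (r, c) in enumerate(grid):
                d2 = (r - grid[bmu][0]) ** 2 + (c - grid[bmu][1]) ** 2
                h = math.exp(-d2 / (2 * sigma * sigma))  # Gaussian neighbourhood
                weights[i] = [w + lr * h * (xi - w) for w, xi in zip(weights[i], x)]
    return weights


def quantization_errors(weights, data):
    """Distance from each input to its best-matching unit (assumed anomaly score)."""
    return [math.sqrt(min(sq_dist(w, x) for w in weights)) for x in data]


# Invented log lines, for illustration only.
logs = [
    "user alice login success",
    "user bob login success",
    "user carol login success",
    "user alice logout success",
    "kernel panic unexpected memory fault",
]
vocab, vectors = tfidf_vectors(logs)
som = train_som(vectors)
errors = quantization_errors(som, vectors)
for line, err in zip(logs, errors):
    print(f"{err:.3f}  {line}")
```

Lines that sit far from every map unit receive larger scores. Whether such a score separates anomalous from normal lines depends on the data, the map size, and the training schedule, none of which are taken from the paper.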
Subject
Computer Networks and Communications, Computer Science Applications
Cited by
1 article.