LogFiT: Log Anomaly Detection using Fine-Tuned Language Models
Author:
Almodovar Crispin, Sabrina Fariza, Karimi Sarvnaz, Azad Salahuddin
Abstract
<p>System logs are a valuable source of information for monitoring and maintaining the security and stability of computer systems. Techniques based on Deep Learning and Natural Language Processing have proven effective at detecting abnormal behavior in these logs. However, existing approaches are inflexible and impractical: techniques that rely on log templates cannot handle variability in log content, while classification-based approaches require labeled data for supervised training. This paper proposes LogFiT, a novel log anomaly detection model that is robust to changes in log content and requires only self-supervised training. LogFiT uses a pretrained BERT-based language model fine-tuned to recognise the linguistic patterns of normal log data, and it is trained using masked token prediction on normal log data only. Consequently, when the model is presented with new log data, its top-k token prediction accuracy is compared against a threshold to determine whether the new data deviates from the normal log data. Experimental results show that LogFiT's F1 score exceeds that of the baselines on the HDFS, BGL and Thunderbird datasets. Critically, when variability in the log data is introduced during evaluation, LogFiT's effectiveness surpasses that of the baseline models.</p>
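The decision rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the default values of k and the threshold, and the assumption that accuracy below the threshold signals an anomaly are all hypothetical choices made here for illustration. The ranked predictions are assumed to come from a masked language model fine-tuned on normal logs; here they are passed in as plain lists so the sketch stays self-contained.

```python
from typing import Sequence


def top_k_accuracy(
    ranked_predictions: Sequence[Sequence[str]],
    true_tokens: Sequence[str],
    k: int,
) -> float:
    """Fraction of masked positions whose true token is among the top-k predictions.

    ranked_predictions[i] lists candidate tokens for masked position i,
    ordered from most to least likely (assumed to come from a masked
    language model; not computed here).
    """
    if len(ranked_predictions) != len(true_tokens):
        raise ValueError("one ranked prediction list is needed per masked token")
    if not true_tokens:
        raise ValueError("at least one masked token is required")
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_tokens)
        if truth in ranked[:k]
    )
    return hits / len(true_tokens)


def is_anomalous(
    ranked_predictions: Sequence[Sequence[str]],
    true_tokens: Sequence[str],
    k: int = 5,             # hypothetical default, not taken from the paper
    threshold: float = 0.9,  # hypothetical default, not taken from the paper
) -> bool:
    """Flag a log sequence when top-k accuracy falls below the threshold (assumed direction)."""
    return top_k_accuracy(ranked_predictions, true_tokens, k) < threshold


# Hypothetical predictions for two masked positions in a log line.
predictions = [["block", "file", "packet"], ["received", "sent", "deleted"]]
normal_tokens = ["block", "received"]    # both tokens are predicted
unusual_tokens = ["socket", "corrupted"]  # neither token is predicted
```

With these toy inputs, `is_anomalous(predictions, normal_tokens, k=3, threshold=0.5)` evaluates to `False` and the same call with `unusual_tokens` evaluates to `True`. In practice the threshold and k would need to be tuned on held-out normal data.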
Publisher
Institute of Electrical and Electronics Engineers (IEEE)
Cited by
1 article.