Affiliation:
1. School of Software Engineering, Sun Yat-sen University, Zhuhai 528406, China
Abstract
Automated log parsing is essential for many log-mining applications, because logs record a wide range of information about events and state changes in an operating system or software at runtime. Over the years, various methods have been proposed for log parsing. With improved log-parsing methods, log-mining applications can gain deeper insights into system behaviors and identify anomalies or failures promptly. However, current log parsers still face limitations, including incomplete or inaccurate parsing of log templates and a lack of parallelism. To overcome these limitations, we designed Polo, a parser that leverages a prefix forest composed of ternary search trees to mine templates from logs. We conducted extensive experiments to evaluate Polo on nine representative system logs, where it achieved an average accuracy of 0.987. Polo is also 9.93% to 40.95% faster than state-of-the-art parsing methods. Furthermore, we evaluated our approach on a downstream log analysis task, specifically anomaly detection. The experimental results demonstrated that, in terms of F1-score, our parser outperformed DeepLog, LogAnomaly, CNN, and LogRobust by 11.5%, 4%, 1%, and 19.1%, respectively, and achieved a recall of 0.971. These results indicate the effectiveness of Polo for anomaly detection.
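The abstract names the data structure but does not describe how Polo builds or uses it. As a rough illustration only, the sketch below shows a generic ternary search tree keyed by whole log tokens rather than characters, plus a small "forest" that keeps one tree per token count. The class names (`Node`, `TokenTST`, `PrefixForest`), the grouping by token count, and the `<*>` wildcard in the sample line are assumptions made for this example, not details taken from the paper.

```python
from typing import Dict, List, Optional


class Node:
    """One node of a ternary search tree keyed by log tokens."""

    def __init__(self, token: str) -> None:
        self.token = token
        self.lo: Optional["Node"] = None  # tokens that sort before self.token
        self.eq: Optional["Node"] = None  # next position in the token sequence
        self.hi: Optional["Node"] = None  # tokens that sort after self.token
        self.is_end = False               # True if a stored sequence ends here


class TokenTST:
    """Ternary search tree over token sequences (hypothetical helper)."""

    def __init__(self) -> None:
        self.root: Optional[Node] = None

    def insert(self, tokens: List[str]) -> None:
        if tokens:
            self.root = self._insert(self.root, tokens, 0)

    def _insert(self, node: Optional[Node], tokens: List[str], i: int) -> Node:
        tok = tokens[i]
        if node is None:
            node = Node(tok)
        if tok < node.token:
            node.lo = self._insert(node.lo, tokens, i)
        elif tok > node.token:
            node.hi = self._insert(node.hi, tokens, i)
        elif i + 1 < len(tokens):
            node.eq = self._insert(node.eq, tokens, i + 1)
        else:
            node.is_end = True
        return node

    def contains(self, tokens: List[str]) -> bool:
        if not tokens:
            return False
        node, i = self.root, 0
        while node is not None:
            tok = tokens[i]
            if tok < node.token:
                node = node.lo
            elif tok > node.token:
                node = node.hi
            elif i + 1 == len(tokens):
                return node.is_end
            else:
                node, i = node.eq, i + 1
        return False


class PrefixForest:
    """One TokenTST per token count (an assumed grouping, for illustration)."""

    def __init__(self) -> None:
        self.trees: Dict[int, TokenTST] = {}

    def insert(self, line: str) -> None:
        tokens = line.split()
        self.trees.setdefault(len(tokens), TokenTST()).insert(tokens)

    def contains(self, line: str) -> bool:
        tokens = line.split()
        tree = self.trees.get(len(tokens))
        return tree is not None and tree.contains(tokens)


forest = PrefixForest()
forest.insert("Connection from <*> closed")
forest.insert("Connection from <*> opened")
```

Because sequences that share leading tokens share a path through the `eq` links, the two sample lines above diverge only at the final token. Grouping by token count keeps each tree small and would let the trees be processed independently, but whether Polo partitions its forest this way is not stated in the abstract.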
Subject
General Mathematics, Engineering (miscellaneous), Computer Science (miscellaneous)
Cited by
1 article.