Authors
Beschastnikh, Ivan; Wang, Patty; Brun, Yuriy; Ernst, Michael D.
Abstract
Distributed systems pose unique challenges for software developers. Reasoning about concurrent activities of system nodes and even understanding the system’s communication topology can be difficult. A standard approach to gaining insight into system activity is to analyze system logs. Unfortunately, this can be a tedious and complex process. This article looks at several key features and debugging challenges that differentiate distributed systems from other kinds of software. The article presents several promising tools and ongoing research to help resolve these challenges.
Publisher
Association for Computing Machinery (ACM)
Cited by
6 articles.
1. Provenance-enhanced Root Cause Analysis for Jupyter Notebooks;2022 IEEE/ACM 15th International Conference on Utility and Cloud Computing (UCC);2022-12
2. Localizing and Explaining Faults in Microservices Using Distributed Tracing;2022 IEEE 15th International Conference on Cloud Computing (CLOUD);2022-07
3. Log Discovery for Troubleshooting Open Distributed Systems with TLQ;Practice and Experience in Advanced Research Computing;2020-07-26
4. Semantics-driven extraction of timed automata from Java programs;Empirical Software Engineering;2019-03-22
5. Autograding Distributed Algorithms in Networked Containers;Proceedings of the 50th ACM Technical Symposium on Computer Science Education;2019-02-22