Telemetry to solve dynamic analysis of a distributed system

Author:

Talaver Oleh V., Vakaliuk Tetiana A.

Abstract

In modern software development, distributed solutions have become common because of the flexibility they offer large companies. The downside is that when such systems are developed, especially by many teams, global design problems may go unnoticed and slow development, make errors harder to locate, or degrade overall system performance. In addition, the distributed nature of the architecture complicates timely reaction to system degradation: manually configuring rules for reporting problematic situations is time-consuming and may still be incomplete, whereas automatic detection of possible anomalies lets engineers (especially Software Reliability Engineers) focus on the problems themselves. For this reason, applications that can dynamically analyse a system for problems have great potential. The use of telemetry for system analysis is being actively studied and is gaining traction, so further research is valuable. This work aims to demonstrate, both theoretically and practically, that telemetry can be used to analyse a distributed information system and to detect harmful architectural practices and anomalous events. To this end, the paper first gives a detailed overview of the problems related to the topic and of the feasibility of using telemetry. The next section briefly describes the history of monitoring systems and the key points of the latest OpenTelemetry standard, reviews popular application performance monitoring systems, and identifies innovative features for further research.
The main part explains the approach used to collect and process telemetry, the reasoning behind choosing Neo4j for data storage, a practical overview of graph theory algorithms that help analyse the collected data, and how the PCA algorithm is employed to detect unusual situations in the system as a whole rather than in individual metrics. The results give an example of using the presented software with Neo4j Bloom to visualise and analyse data collected over several hours from the OpenTelemetry Demo test system. The last section contains additional remarks on the results of the study.
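
The idea of applying PCA to the whole system rather than to individual metrics can be illustrated with a minimal sketch. It is not the authors' implementation: the three metrics (requests per second, latency, CPU), the synthetic baseline, the power-iteration routine standing in for a library PCA, and the "maximum baseline error" threshold are all assumptions made for illustration. The point is that a sample whose individual values are each within their normal range can still be flagged when it breaks the usual correlation between metrics.

```python
import math
import random


def standardize(rows):
    """Z-score each column; return scaled rows plus the means and stds used."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) or 1.0
            for j in range(d)]
    scaled = [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]
    return scaled, means, stds


def covariance(rows):
    """Covariance matrix of rows that are already centred."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / n for j in range(d)]
            for i in range(d)]


def unit(vector):
    """Scale a vector to length 1 (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else vector


def top_components(cov, k, iterations=200):
    """Approximate the k leading eigenvectors by power iteration with deflation."""
    d = len(cov)
    cov = [row[:] for row in cov]  # work on a copy; deflation mutates it
    rng = random.Random(0)
    components = []
    for _component in range(k):
        v = unit([rng.uniform(-1.0, 1.0) for _ in range(d)])
        for _step in range(iterations):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            if not any(w):
                break
            v = unit(w)
        eigenvalue = sum(v[i] * cov[i][j] * v[j]
                         for i in range(d) for j in range(d))
        components.append(v)
        # Remove the found direction so the next pass finds the next one.
        for i in range(d):
            for j in range(d):
                cov[i][j] -= eigenvalue * v[i] * v[j]
    return components


def reconstruction_error(row, components):
    """Squared distance between a row and its projection onto the components."""
    projection = [0.0] * len(row)
    for comp in components:
        score = sum(a * b for a, b in zip(row, comp))
        for j, b in enumerate(comp):
            projection[j] += score * b
    return sum((a - p) ** 2 for a, p in zip(row, projection))


# Hypothetical baseline: three metrics that all follow one latent "load".
rng = random.Random(42)
baseline = []
for _ in range(300):
    load = rng.uniform(10, 100)
    baseline.append([load + rng.gauss(0, 2),        # requests per second
                     2 * load + rng.gauss(0, 4),    # latency, ms
                     0.5 * load + rng.gauss(0, 1)])  # CPU, %

scaled, means, stds = standardize(baseline)
components = top_components(covariance(scaled), k=1)

# Assumed rule: anything worse than the worst baseline sample is anomalous.
threshold = max(reconstruction_error(r, components) for r in scaled)


def is_anomalous(sample):
    """Score a raw sample against the model fitted on the baseline."""
    z = [(x - m) / s for x, m, s in zip(sample, means, stds)]
    return reconstruction_error(z, components) > threshold


typical_sample = [50.0, 100.0, 25.0]  # metrics move together as usual
odd_sample = [30.0, 160.0, 15.0]      # high latency at low load
print(is_anomalous(typical_sample), is_anomalous(odd_sample))
```

In `odd_sample` every value lies inside the range seen in the baseline, so a per-metric threshold would probably stay silent; the reconstruction error is large only because the combination is unusual. In practice a library PCA and a statistically motivated threshold would likely replace the hand-rolled parts here.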

Publisher

Academy of Cognitive and Natural Sciences

