Author:
Narayanan Shammy, S Maheswari, Zephan Prisha
Abstract
Data pipelines are crucial for processing and transforming data in domains such as finance, healthcare, and e-commerce. Ensuring their reliability and accuracy is essential to maintaining data integrity and supporting informed business decisions. In this paper, we explore the significance of continuous monitoring in data pipelines and its contribution to data observability. We discuss the challenges of monitoring data pipelines in real time, propose a framework for real-time monitoring, and highlight its benefits in enhancing data observability. The findings emphasize the need for organizations to adopt continuous monitoring practices to ensure data quality, detect anomalies, and improve overall system performance.
Publisher
European Alliance for Innovation n.o.