Building Advanced Web Applications Using Data Ingestion and Data Processing Tools

Author:

Šprem Šimun (1), Tomažin Nikola (1), Matečić Jelena (1), Horvat Marko (2)

Affiliation:

1. Syntio, Trg Dražena Petrovića 3, HR-10000 Zagreb, Croatia

2. Department of Applied Computing, Faculty of Electrical Engineering and Computing, University of Zagreb, Unska 3, HR-10000 Zagreb, Croatia

Abstract

Today, advanced websites serve as robust data repositories that continuously collect user-centered information and prepare it for subsequent processing. The collected data range from email addresses, usernames, and passwords to demographic information such as age, gender, and geographic location. User behavior metrics are also collected, including browsing history, click patterns, and time spent on pages, as well as preferences such as product selection, language, and individual settings. Interactions, device information, transaction history, authentication data, communication logs, and various analytics and metrics complete the range of user-centric information that websites collect. A method for systematically ingesting such differently structured information and transferring it to a central message broker is described in detail. In this context, a novel tool, Dataphos Publisher, is presented for creating ready-to-digest data packages. Data acquired from the message broker are then used for data quality analysis, storage, conversion, and downstream processing. A brief overview of commonly used, freely available tools for data ingestion and processing is also provided.

Publisher

MDPI AG

