EverAnalyzer: A Self-Adjustable Big Data Management Platform Exploiting the Hadoop Ecosystem

Author:

Karamolegkos Panagiotis1,Mavrogiorgou Argyro1ORCID,Kiourtis Athanasios1ORCID,Kyriazis Dimosthenis1ORCID

Affiliation:

1. Department of Digital Systems, University of Piraeus, 18534 Piraeus, Greece

Abstract

Big Data is a phenomenon that affects today’s world, with new data being generated every second. Today’s enterprises face major challenges from the increasingly diverse data, as well as from indexing, searching, and analyzing such enormous amounts of data. In this context, several frameworks and libraries for processing and analyzing Big Data exist. Among those frameworks Hadoop MapReduce, Mahout, Spark, and MLlib appear to be the most popular, although it is unclear which of them best suits and performs in various data processing and analysis scenarios. This paper proposes EverAnalyzer, a self-adjustable Big Data management platform built to fill this gap by exploiting all of these frameworks. The platform is able to collect data both in a streaming and in a batch manner, utilizing the metadata obtained from its users’ processing and analytical processes applied to the collected data. Based on this metadata, the platform recommends the optimum framework for the data processing/analytical activities that the users aim to execute. To verify the platform’s efficiency, numerous experiments were carried out using 30 diverse datasets related to various diseases. The results revealed that EverAnalyzer correctly suggested the optimum framework in 80% of the cases, indicating that the platform made the best selections in the majority of the experiments.

Funder

European Union and Greek national funds through the Operational Program Competitiveness, Entrepreneurship and Innovation, under the call RESEARCH – CREATE – INNOVATE

Publisher

MDPI AG

Subject

Information Systems

Reference102 articles.

1. Statista (2022, September 16). Total Data Volume Worldwide 2010–2025. Available online: https://www.statista.com/statistics/871513/worldwide-data-created/.

2. Forbes (2022, January 17). Big Data Goes Big. Available online: https://www.forbes.com/sites/rkulkarni/2019/02/07/big-data-goes-big/?sh=5b985d0920d7.

3. A review paper on big data and Hadoop;Bhosale;IJSR,2014

4. Comparative Analysis of Various Tools for Data Mining and Big Data Mining;SangeethaLakshmi;IRJET,2019

5. (2022, September 17). Apache Hadoop Home Page. Available online: https://hadoop.apache.org/.

Cited by 3 articles. 订阅此论文施引文献 订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3