IMapC: Inner MAPping Combiner to Enhance the Performance of MapReduce in Hadoop

Published:2022-05-17 Issue:10 Volume:11 Page:1599
ISSN:2079-9292
Container-title:Electronics
language:en
Short-container-title:Electronics

Author:

Kavitha C., Srividhya S. R., Lai Wen-Cheng, Mani Vinodhini

Abstract

Hadoop is a framework for storing and processing huge amounts of data. With the Hadoop Distributed File System (HDFS), large data sets can be managed on commodity hardware. MapReduce is a programming model for processing vast amounts of data in parallel through a map phase and a reduce phase. A very large amount of data is transferred from the Mapper to the Reducer without any filtering or recursion, which consumes excessive bandwidth. In this paper, we introduce an algorithm called Inner MAPping Combiner (IMapC) for the map phase. The algorithm runs in the Mapper and combines the values of recurring keys. To evaluate its efficiency, several approaches were tested. According to the tests, MapReduce programs implemented with the Default Combiner (DC) of IMapC are 70% more efficient than those implemented without one. This work can be combined with MapReduce to make computations significantly faster.
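The idea of combining the values of recurring keys inside the Mapper can be illustrated with a small word-count sketch. This is not the authors' IMapC implementation, which the abstract does not spell out. It only shows the general in-mapper combining pattern in plain Python, without Hadoop. The function names (`plain_mapper`, `in_mapper_combining`, `reduce_counts`) and the sample input are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def plain_mapper(lines):
    """Emit one (word, 1) pair per token, as a mapper without combining would."""
    for line in lines:
        for word in line.split():
            yield (word, 1)


def in_mapper_combining(lines):
    """Sum the values of recurring keys inside the map task, then emit once per key.

    Assumption: the per-task dictionary fits in memory.
    """
    counts = defaultdict(int)
    for line in lines:
        for word in line.split():
            counts[word] += 1
    # Emit only after the whole input split has been processed.
    yield from counts.items()


def reduce_counts(pairs):
    """Sum all values per key, standing in for the reduce phase."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


# Hypothetical input split, used only to compare the two mappers.
split = ["a b a", "b a c"]
plain = list(plain_mapper(split))
combined = list(in_mapper_combining(split))
```

Both mappers lead to the same reduced totals, but the combining variant emits one pair per distinct key instead of one pair per token, so fewer intermediate records would need to travel from Mapper to Reducer.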

Publisher

MDPI AG

Subject

Electrical and Electronic Engineering,Computer Networks and Communications,Hardware and Architecture,Signal Processing,Control and Systems Engineering

Link

https://www.mdpi.com/2079-9292/11/10/1599/pdf

Reference: 27 articles.

1. In-Memory Cache and Intra-Node Combiner Approaches for Optimizing Execution Time in High-Performance Computing

2. Task failure resilience technique for improving the performance of MapReduce in Hadoop

Cited by 4 articles.

1. Big Data Privacy Protection and Security Provisions of the Healthcare SecPri-BGMPOP Method in a Cloud Environment;Mathematics;2024-06-25

2. An HBase-Based Optimization Model for Distributed Medical Data Storage and Retrieval;Electronics;2023-02-16

3. An Efficient and Secure Big Data Storage in Cloud Environment by Using Triple Data Encryption Standard;Big Data and Cognitive Computing;2022-09-26

4. Performance Evaluation of Stateful Firewall-Enabled SDN with Flow-Based Scheduling for Distributed Controllers;Electronics;2022-09-22