Abstract
Hadoop is a framework for storing and processing very large amounts of data. With HDFS, large data sets can be managed on commodity hardware. MapReduce is a programming model for processing such data in parallel, and the MapReduce framework carries out both the map and reduce phases. A very large amount of data is transferred from Mapper to Reducer without any filtering or recursion, which consumes excessive bandwidth. In this paper, we introduce an algorithm for the map phase called Inner MAPping Combiner (IMapC). Running inside the Mapper, the algorithm combines the values of recurring keys. To evaluate the algorithm's efficiency, different approaches were tested. According to the tests, MapReduce programs implemented with the Default Combiner (DC) of IMapC are 70% more efficient than those implemented without one. This work can be combined with MapReduce to make computations significantly faster.
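The idea of combining the values of recurring keys inside the Mapper can be illustrated with a small simulation. This is a minimal sketch of generic in-mapper combining, not the authors' IMapC implementation. The word-count workload, the sample input, and the function names (`naive_map`, `in_mapper_combining_map`, `reduce_counts`) are assumptions chosen for illustration, and the code runs in plain Python rather than on Hadoop.

```python
from collections import defaultdict


def naive_map(lines):
    """Hypothetical baseline mapper: emit one (word, 1) pair per token."""
    return [(word, 1) for line in lines for word in line.split()]


def in_mapper_combining_map(lines):
    """Hypothetical combining mapper: sum the values of recurring keys
    locally and emit one pair per distinct key."""
    partial = defaultdict(int)
    for line in lines:
        for word in line.split():
            partial[word] += 1
    return sorted(partial.items())


def reduce_counts(pairs):
    """Reducer: sum all values received for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


# Assumed sample input split, for illustration only.
split = ["a b a", "b a c"]

naive = naive_map(split)
combined = in_mapper_combining_map(split)

# Both mappers lead to the same reduced result, but the combining
# mapper hands fewer records to the shuffle stage.
print(len(naive), len(combined))
print(reduce_counts(naive) == reduce_counts(combined))
```

In this toy input the combining mapper emits one record per distinct word instead of one per token, which is the kind of reduction in Mapper-to-Reducer traffic the abstract describes. The saving on real workloads depends on how often keys recur within a split.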
Subject
Electrical and Electronic Engineering, Computer Networks and Communications, Hardware and Architecture, Signal Processing, Control and Systems Engineering
Cited by
4 articles.