Abstract
Extensive use of Internet-based applications in day-to-day life has led to the generation of huge amounts of data every minute. Apart from humans, data is also generated by machines such as sensors, satellites, and CCTV cameras. This huge collection of heterogeneous data is often referred to as Big Data, which can be processed to draw useful insights. Apache Hadoop has emerged as a widely used open-source software framework for Big Data processing; it runs on a cluster of cooperating computers that enables distributed parallel processing. The Hadoop Distributed File System (HDFS) stores data blocks that are replicated and spread across different nodes. HDFS uses AES-based cryptographic techniques at the block level, which are transparent and end-to-end in nature. Although cryptography protects the data blocks from unauthorized access, a legitimate user can still harm the data. One such example is the execution of malicious MapReduce jar files by a legitimate user, which can harm the data in HDFS. We developed a mechanism in which every MapReduce jar is tested by our sandbox security layer to ensure that it is not malicious, and suspicious jar files are not allowed to process the data in HDFS. This feature is not present in the existing Apache Hadoop framework, and our work is available on GitHub for consideration and inclusion in future versions of Apache Hadoop.
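The abstract does not describe how the sandbox decides that a jar is suspicious, so the following is only a minimal sketch of the general idea of gating a jar before it reaches the cluster. It swaps in a much simpler technique, a static scan of the jar's class files for deny-listed class references, rather than sandboxed execution. The names `SUSPICIOUS_MARKERS`, `scan_jar`, and `is_allowed`, and the choice of deny-listed classes, are assumptions made for illustration and are not taken from the paper.

```python
import io
import zipfile

# Hypothetical deny-list (an assumption, not from the paper): class names,
# in the slash-separated form used inside compiled class files, whose
# presence marks a jar as suspicious.
SUSPICIOUS_MARKERS = (
    b"java/lang/Runtime",         # can spawn external processes
    b"java/lang/ProcessBuilder",  # can spawn external processes
)


def scan_jar(jar_bytes: bytes) -> list[str]:
    """Return one 'entry: marker' finding per deny-listed reference found.

    A jar is a zip archive, so each .class entry is read and searched for
    the raw bytes of every marker. Non-class entries are ignored.
    """
    findings = []
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        for name in jar.namelist():
            if not name.endswith(".class"):
                continue
            data = jar.read(name)
            for marker in SUSPICIOUS_MARKERS:
                if marker in data:
                    findings.append(f"{name}: {marker.decode()}")
    return findings


def is_allowed(jar_bytes: bytes) -> bool:
    """Gate decision: allow the jar only if the scan reports no findings."""
    return not scan_jar(jar_bytes)
```

A submission wrapper could call `is_allowed` on the jar's bytes and refuse to hand the job to Hadoop when it returns `False`. A byte-level scan like this is easy to evade (for example through reflection), which is one reason a dynamic sandbox of the kind the abstract describes would be the stronger check.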
Publisher
Springer Science and Business Media LLC
Subject
Information Systems and Management, Computer Networks and Communications, Hardware and Architecture, Information Systems