A Memory-Aware Spark Cache Replacement Strategy

Published: 2022-11
Volume: 23
Issue: 6
Pages: 1185-1190
ISSN: 1607-9264
Container-title: 網際網路技術學刊
Short-container-title: Journal of Internet Technology

Author:

Jingyu Zhang, Ruihan Zhang, Osama Alfarraj, Amr Tolba, Gwang-Jun Kim

Abstract

Spark is currently the most widely used distributed computing framework, and its key data abstraction, the Resilient Distributed Dataset (RDD), brings significant performance improvements to big data computing. In practice, Spark jobs often have to evict cached RDDs because memory is insufficient. By default, Spark uses the Least Recently Used (LRU) algorithm as its cache replacement strategy. LRU considers only the most recent access time of each RDD, so an RDD that will be reused later can still be evicted during cache replacement, which degrades Spark performance. To address this problem, this paper proposes a memory-aware Spark cache replacement strategy that selects the RDDs to evict by jointly considering cluster memory usage, RDD size, RDD dependencies, usage counts, and other information. The paper also designs extensive experiments to test and analyze the performance of the proposed strategy. The experimental data show that it improves performance by up to 13% compared with the LRU algorithm across different scenarios.
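The contrast the abstract draws between LRU and a memory-aware policy can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `CachedRDD` record, the `retention_score` formula, and its weights are hypothetical and are not taken from the paper, which does not state its scoring rule in this abstract. The sketch only shows how a policy that weighs usage counts, dependencies, size, and memory pressure can pick a different victim than LRU.

```python
from dataclasses import dataclass


@dataclass
class CachedRDD:
    """Hypothetical bookkeeping record for one cached RDD (not a Spark API)."""
    name: str
    size_mb: float         # memory occupied by the cached RDD
    last_used: int         # logical timestamp of the most recent access
    use_count: int         # number of times the RDD has been read so far
    pending_children: int  # downstream RDDs that still depend on it


def _evict(ranked, needed_mb):
    """Walk the ranked list and collect victims until needed_mb is freed."""
    victims, freed = [], 0.0
    for rdd in ranked:
        if freed >= needed_mb:
            break
        victims.append(rdd.name)
        freed += rdd.size_mb
    return victims


def lru_victims(cache, needed_mb):
    """LRU: evict in order of oldest last access, ignoring everything else."""
    return _evict(sorted(cache, key=lambda r: r.last_used), needed_mb)


def retention_score(rdd, memory_pressure):
    """Assumed score (higher = more worth keeping), NOT the paper's formula.

    Reuse value grows with past uses and pending dependents; the penalty
    grows with RDD size, scaled by how full the cluster memory is.
    """
    reuse_value = rdd.use_count + 2.0 * rdd.pending_children
    size_penalty = 1.0 + memory_pressure * rdd.size_mb
    return reuse_value / size_penalty


def memory_aware_victims(cache, needed_mb, used_mb, capacity_mb):
    """Evict the RDDs with the lowest retention score first."""
    memory_pressure = used_mb / capacity_mb
    ranked = sorted(cache, key=lambda r: retention_score(r, memory_pressure))
    return _evict(ranked, needed_mb)


# Example cache: "a" is the oldest entry but is heavily reused.
cache = [
    CachedRDD("a", 100, last_used=1, use_count=5, pending_children=3),
    CachedRDD("b", 100, last_used=2, use_count=1, pending_children=0),
    CachedRDD("c", 50, last_used=3, use_count=2, pending_children=1),
]

print(lru_victims(cache, needed_mb=100))
print(memory_aware_victims(cache, needed_mb=100, used_mb=250, capacity_mb=256))
```

In this example LRU evicts `a`, the least recently accessed RDD, even though it has the most pending dependents, while the assumed score evicts `b`, which is the same size but is unlikely to be reused.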

Publisher

Angle Publishing Co., Ltd.

Subject

Computer Networks and Communications,Software

Cited by 2 articles.

1. Design of distributed network intrusion prevention system based on Spark and P2DR models; Cluster Computing; 2024-05-11

2. Few-Sample Anomaly Detection in Industrial Images With Edge Enhancement and Cascade Residual Feature Refinement; IEEE Transactions on Industrial Informatics; 2024