Abstract
Co-allocated data centers deploy online services and offline workloads in the same cluster to improve resource utilization. Spark applications are a typical offline batch workload. At present, resource scheduling strategies for co-allocated data centers focus mainly on online services. Spark applications still use their original resource scheduling, which cannot solve the data dependency and deadline problems between Spark applications and online services. This paper proposes a data-aware resource-scheduling model that meets the deadline requirements of Spark applications and optimizes data processing throughput while ensuring the quality of service of online services.
Publisher
Springer Nature Singapore
Cited by
1 article.