Affiliation:
1. College of Computer Science, Nankai University, Tianjin, China
2. State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
3. College of Cyber Science, Nankai University, Tianjin, China
4. Department of Computer Science, New Jersey Institute of Technology, Newark, New Jersey, USA
Abstract
Kubernetes, the best-known container orchestration framework, is now widely used to manage and schedule resources for microservices in cloud-native distributed applications. However, its native scheduler, Kube-scheduler, preferentially places microservices on nodes with rich and balanced CPU and memory resources, judging each node on its own. This may cause resource fragmentation and decrease resource utilization. In this paper, we propose DRS, a deep reinforcement learning enhanced Kubernetes scheduler. We first formulate the Kubernetes scheduling problem as a Markov decision process with carefully designed state, action, and reward structures, aiming to increase resource utilization and decrease load imbalance. Then, we design and implement a DRS monitor that perceives six parameters concerning resource utilization and builds a global view of all available resources. Finally, DRS automatically learns the scheduling policy through interaction with the Kubernetes cluster, without relying on expert knowledge about workload and cluster status. We implement a prototype of DRS in a Kubernetes cluster with five nodes and evaluate its performance. Experimental results show that DRS overcomes the shortcomings of Kube-scheduler and achieves the expected scheduling target with three workloads. With only 3.27% CPU overhead and 0.648% communication delay, DRS outperforms Kube-scheduler by 27.29% in terms of resource utilization and reduces load imbalance by 2.90 times on average.
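To make the Markov decision process framing concrete, the following is a minimal sketch of one possible state, action, and reward design. It is not the DRS implementation. The abstract does not name the six monitored parameters or give the reward formula, so everything here is an assumption made for illustration: the state holds only per-node CPU and memory utilization in [0, 1], the action is the index of the node that receives the pod, the reward is mean utilization minus the standard deviation of utilization across nodes, and the names `reward`, `place`, `greedy_action`, `alpha`, and `beta` are hypothetical. A one-step greedy choice stands in for the learned policy.

```python
from statistics import mean, pstdev

RESOURCES = ("cpu", "mem")  # assumed state features; the paper monitors six parameters


def reward(nodes, alpha=1.0, beta=1.0):
    """Assumed reward: high average utilization, low spread across nodes."""
    util = mean(mean([n[r] for n in nodes]) for r in RESOURCES)
    imbalance = mean(pstdev([n[r] for n in nodes]) for r in RESOURCES)
    return alpha * util - beta * imbalance


def place(nodes, node_index, pod):
    """Return the next state after scheduling `pod` on node `node_index`."""
    nxt = [dict(n) for n in nodes]  # copy so the current state is not mutated
    for r in RESOURCES:
        nxt[node_index][r] += pod[r]
    return nxt


def greedy_action(nodes, pod):
    """One-step greedy stand-in for a learned policy; returns a node index or None."""
    best, best_reward = None, float("-inf")
    for i in range(len(nodes)):
        nxt = place(nodes, i, pod)
        if any(nxt[i][r] > 1.0 for r in RESOURCES):
            continue  # the pod does not fit on this node
        r_value = reward(nxt)
        if r_value > best_reward:
            best, best_reward = i, r_value
    return best


if __name__ == "__main__":
    cluster = [{"cpu": 0.8, "mem": 0.8}, {"cpu": 0.2, "mem": 0.2}]
    pod = {"cpu": 0.1, "mem": 0.1}
    print(greedy_action(cluster, pod))
```

Under this assumed reward, two cluster states with the same average utilization are ranked by how evenly the load is spread, which is the trade-off between resource utilization and load imbalance that the abstract describes. A reinforcement learning agent would learn which placements maximize long-term reward instead of scoring one step ahead.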
Funder
National Key Research and Development Program of China
Cited by
5 articles.
1. Telemetry-Driven Microservices Orchestration in Cloud-Edge Environments;2024 IEEE 17th International Conference on Cloud Computing (CLOUD);2024-07-07
2. Graph Attention Networks and Deep Q-Learning for Service Mesh Optimization: A Digital Twinning Approach;ICC 2024 - IEEE International Conference on Communications;2024-06-09
3. CAROKRS: Cost-Aware Resource Optimization Kubernetes Resource Scheduler;2024 9th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA);2024-04-25
4. Development of a Query Delay Injection System for the MEC Simulator of the LWMECPS Platform;2024 International Russian Smart Industry Conference (SmartIndustryCon);2024-03-25
5. ODRL: Reinforcement Learning in Priority Scheduling for Running Cost Optimization;2023 IEEE 29th International Conference on Parallel and Distributed Systems (ICPADS);2023-12-17