Author:
Arbat Shivani, Jayakumar Vinodh Kumaran, Lee Jaewoo, Wang Wei, Kim In Kee
Abstract
Predictive VM (Virtual Machine) auto-scaling is a promising technique to optimize cloud applications' operating costs and performance. Understanding the job arrival rate is crucial for accurately predicting future changes in cloud workloads and proactively provisioning and de-provisioning VMs for hosting the applications. However, developing a model that accurately predicts cloud workload changes is extremely challenging due to the dynamic nature of cloud workloads. Long Short-Term Memory (LSTM) models have been developed for cloud workload prediction. Unfortunately, the state-of-the-art LSTM model relies on recurrence to make predictions, which adds complexity and increases the inference overhead as input sequences grow longer. To develop a cloud workload prediction model with high accuracy and low inference overhead, this work presents a novel time-series forecasting model called WGAN-gp Transformer, inspired by the Transformer network and improved Wasserstein GANs. The proposed method adopts a Transformer network as the generator and a multi-layer perceptron as the critic. Extensive evaluations with real-world workload traces show that WGAN-gp Transformer achieves 5× faster inference time with up to 5.1% higher prediction accuracy than the state-of-the-art. We also apply WGAN-gp Transformer to auto-scaling mechanisms on Google cloud platforms, and the WGAN-gp Transformer-based auto-scaling mechanism outperforms the LSTM-based mechanism by significantly reducing VM over-provisioning and under-provisioning rates.
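To make the over-provisioning and under-provisioning rates mentioned in the abstract concrete, the following is a minimal sketch of one way such rates could be computed from predicted and actual job arrival counts. The abstract does not define these metrics, so the capacity model (`jobs_per_vm`), the function names, the normalization by required VMs, and the sample numbers are all illustrative assumptions, not the authors' method or data.

```python
import math


def vms_needed(jobs, jobs_per_vm):
    """Smallest number of VMs that can host `jobs` jobs.

    Assumption: each VM handles a fixed number of jobs per interval.
    """
    return math.ceil(jobs / jobs_per_vm)


def provisioning_rates(predicted, actual, jobs_per_vm):
    """Return (over_rate, under_rate) for a sequence of intervals.

    Assumption: both rates are normalized by the total number of VMs
    that were actually required across all intervals.
    """
    over = under = required = 0
    for p, a in zip(predicted, actual):
        provisioned = vms_needed(p, jobs_per_vm)  # VMs started from the forecast
        needed = vms_needed(a, jobs_per_vm)       # VMs the real workload required
        over += max(provisioned - needed, 0)      # idle VMs (wasted cost)
        under += max(needed - provisioned, 0)     # missing VMs (lost performance)
        required += needed
    if required == 0:
        return 0.0, 0.0
    return over / required, under / required


# Hypothetical job arrival counts per interval (not from the paper).
predicted = [120, 80, 200, 150]
actual = [100, 95, 210, 150]
over_rate, under_rate = provisioning_rates(predicted, actual, jobs_per_vm=10)
print(f"over-provisioning rate:  {over_rate:.3f}")
print(f"under-provisioning rate: {under_rate:.3f}")
```

Under this definition, a forecaster that tracks the arrival rate more closely lowers both rates at once, which is the sense in which a more accurate prediction model can improve an auto-scaling mechanism.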
Publisher
Association for the Advancement of Artificial Intelligence (AAAI)
Cited by
12 articles.
1. Improving Volatility Forecasting: A Study through Hybrid Deep Learning Methods with WGAN;Journal of Risk and Financial Management;2024-08-23
2. Parallel Task Scheduling in Autonomous Robotic Systems: An Event-Driven Multimodal Prediction Approach;Proceedings of the 53rd International Conference on Parallel Processing;2024-08-12
3. Exploiting sequence characteristics for long-term workload prediction in cloud data centers;Third International Conference on Algorithms, Microchips, and Network Applications (AMNA 2024);2024-06-08
4. A Brief Review on Prediction Methods for Cloud Resource Management;2024 9th IEEE International Conference on Smart Cloud (SmartCloud);2024-05-10
5. Computing Power Scheduling Method Based on Long-Term Workload Prediction for Computing Power Platform;2024 6th International Conference on Communications, Information System and Computer Engineering (CISCE);2024-05-10