Affiliation:
1. NEC Laboratories America
2. Department of Computer Sciences, University of Wisconsin-Madison
Abstract
Query scheduling, a fundamental problem in database management systems, has recently received renewed attention, perhaps in part due to the rise of the "database as a service" (DaaS) model for database deployment. While there has been a great deal of work investigating different scheduling algorithms, there has been comparatively little work investigating what the scheduling algorithms can or should know about the queries to be scheduled. In this work, we investigate the efficacy of using histograms describing the distribution of likely query execution times as input to the query scheduler. We propose a novel distribution-based scheduling algorithm, Shepherd, and show through extensive experimentation with both synthetic and TPC workloads that Shepherd substantially outperforms state-of-the-art point-based methods.
Subject
General Earth and Planetary Sciences, Water Science and Technology, Geography, Planning and Development
Cited by
6 articles.
1. HTD: heterogeneous throughput-driven task scheduling algorithm in MapReduce;Distributed and Parallel Databases;2021-10-28
2. Quantifying the Influence of Intermittent Connectivity on Mobile Edge Computing;IEEE Transactions on Cloud Computing;2019
3. 3Sigma;Proceedings of the Thirteenth EuroSys Conference;2018-04-23
4. Distribution Based Workload Modelling of Continuous Queries in Clouds;IEEE Transactions on Emerging Topics in Computing;2017-01-01
5. Resource and Performance Distribution Prediction for Large Scale Analytics Queries;Proceedings of the 7th ACM/SPEC on International Conference on Performance Engineering;2016-03-12