Evaluation of Machine Learning Algorithms in Predicting the Next SQL Query from the Future

Published: 2021-04 Issue: 1 Volume: 46 Page: 1-46
ISSN: 0362-5915
Container-title: ACM Transactions on Database Systems
Language: en
Short-container-title: ACM Trans. Database Syst.

Author:

Meduri Venkata Vamsikrishna¹, Chowdhury Kanchan¹, Sarwat Mohamed¹

Affiliation:

1. Arizona State University, Tempe, AZ, USA

Abstract

Prediction of the next SQL query from the user, given her sequence of queries until the current timestep, during an ongoing interaction session of the user with the database, can help in speculative query processing and increased interactivity. While existing machine learning (ML)-based approaches use recommender systems to suggest relevant queries to a user, there has been no exhaustive study on applying temporal predictors to predict the next user-issued query. In this work, we experimentally compare ML algorithms in predicting the immediate next future query in an interaction workload, given the current user query or the sequence of queries in a user session thus far. As a part of this, we propose the adaptation of two powerful temporal predictors: (a) Recurrent Neural Networks (RNNs) and (b) a Reinforcement Learning approach called Q-Learning that uses Markov Decision Processes. We represent each query as a comprehensive set of fragment embeddings that not only captures the SQL operators, attributes, and relations but also the arithmetic comparison operators and constants that occur in the query. Our experiments on two real-world datasets show the effectiveness of temporal predictors against the baseline recommender systems in predicting the structural fragments in a query w.r.t. both quality and time. Besides showing that RNNs can be used to synthesize novel queries, we find that exact Q-Learning outperforms RNNs despite predicting the next query entirely from the historical query logs.
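
Illustrative sketch (not taken from the paper): the idea of predicting the next query purely from historical logs with tabular Q-Learning might look roughly like the Python below. Everything in it is an assumption made for illustration: the function names train_q_table and predict_next, the query identifiers q1 to q4, the 0/1 reward for matching the logged next query, the choice to treat the predicted query as the next state, and the values of alpha, gamma, and epochs. The paper's actual state, action, and reward definitions over fragment embeddings may differ.

```python
from collections import defaultdict


def train_q_table(sessions, alpha=0.1, gamma=0.5, epochs=50):
    """Tabular Q-learning over logged query sessions (hypothetical sketch).

    State  = the current query id.
    Action = a candidate next query id (any query seen right after the state).
    Reward = 1 if the candidate matches the logged next query, else 0 (assumed).
    """
    # Candidate actions per state, collected from the historical log.
    candidates = defaultdict(set)
    for session in sessions:
        for state, nxt in zip(session, session[1:]):
            candidates[state].add(nxt)

    q = defaultdict(dict)
    for _ in range(epochs):
        for session in sessions:
            for state, nxt in zip(session, session[1:]):
                for action in sorted(candidates[state]):
                    reward = 1.0 if action == nxt else 0.0
                    # Assumption: taking an action moves the agent to that query.
                    future = max(q[action].values(), default=0.0)
                    old = q[state].get(action, 0.0)
                    q[state][action] = old + alpha * (reward + gamma * future - old)
    return q


def predict_next(q, state):
    """Return the highest-valued next query for a state, or None if unseen."""
    actions = q.get(state)
    if not actions:
        return None
    return max(actions, key=actions.get)


# Toy log: three sessions of hypothetical query ids.
sessions = [["q1", "q2", "q3"], ["q1", "q2", "q4"], ["q1", "q2", "q3"]]
q_table = train_q_table(sessions)
print(predict_next(q_table, "q1"))
print(predict_next(q_table, "q2"))
```

Because the table is built only from transitions present in the log, a predictor of this kind can return only queries that already occurred, which is consistent with the abstract's remark that Q-Learning predicts entirely from historical query logs whereas RNNs can synthesize novel queries.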

Funder

NSF

Publisher

Association for Computing Machinery (ACM)

Subject

Information Systems

Link

https://dl.acm.org/doi/pdf/10.1145/3442338


Cited by 5 articles.

1. Vertically Autoscaling Monolithic Applications with CaaSPER: Scalable Container-as-a-Service Performance Enhanced Resizing Algorithm for the Cloud;Companion of the 2024 International Conference on Management of Data;2024-06-09

2. Log Replaying for Real-Time HTAP: An Adaptive Epoch-Based Two-Stage Framework;2024 IEEE 40th International Conference on Data Engineering (ICDE);2024-05-13

3. Sibyl: Forecasting Time-Evolving Query Workloads;Proceedings of the ACM on Management of Data;2024-03-12

4. An Analysis of AI-based SQL Injection (SQLi) Attack Detection;2023 Second International Conference on Augmented Intelligence and Sustainable Systems (ICAISS);2023-08-23

5. Predicting the Future Actions of People in the Real World to Improve Health Management;Artificial Intelligence in Data and Big Data Processing;2022