Affiliation:
1. University of Washington, USA
Abstract
SQL is the de facto language for manipulating relational data. Though SQL is powerful, many users find it difficult to write SQL queries because of its highly expressive constructs.
While using the programming-by-example paradigm to help users write SQL queries is an attractive proposition, as evidenced by online help forums such as Stack Overflow, developing techniques for synthesizing SQL queries from given input-output (I/O) examples has been difficult, due to the large space of SQL queries as a result of its rich set of operators.
In this paper, we present a new scalable and efficient algorithm for synthesizing SQL queries based on I/O examples. The key innovation of our algorithm is the development of a language for abstract queries, i.e., queries with uninstantiated operators, that can be used to express a large space of SQL queries efficiently. Using abstract queries to represent the search space nicely decomposes the synthesis problem into two tasks: 1) searching for abstract queries that can potentially satisfy the given I/O examples, and 2) instantiating the found abstract queries and ranking the results.
We have implemented this algorithm in a new tool called Scythe and evaluated it using 193 benchmarks collected from Stack Overflow. Our evaluation shows that Scythe can efficiently solve 74% of the benchmarks, most in just a few seconds, and the queries range from simple ones involving a single selection to complex queries with 6 nested subqueries.
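The two-task decomposition described in the abstract can be illustrated with a toy sketch. This is not Scythe's algorithm or its abstract query language. It assumes a single hypothetical table `t`, abstract queries of the form `SELECT cols FROM t WHERE <hole>`, and a simple length-based ranking. All names (`ROWS`, `feasible`, `instantiate`, `synthesize`) are invented for illustration.

```python
from itertools import permutations, product
import operator

# Hypothetical input table t(id, dept, salary); not taken from the paper.
COLUMNS = ("id", "dept", "salary")
ROWS = [
    (1, "eng", 120),
    (2, "eng", 90),
    (3, "ops", 100),
    (4, "ops", 70),
]
OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}


def project(rows, proj):
    """Keep only the columns listed in proj (a tuple of column indices)."""
    return [tuple(r[i] for i in proj) for r in rows]


def feasible(rows, proj, expected):
    """Task 1 check: evaluate the abstract query with its predicate left
    as a hole. The hole is treated as 'true', so the result contains the
    output of every possible instantiation. If an expected row is missing
    even from this over-approximation, the abstract query is pruned."""
    approx = project(rows, proj)
    return all(row in approx for row in expected)


def instantiate(rows, proj, expected):
    """Task 2: fill the hole with 'column op constant' predicates and keep
    the instantiations whose output matches the example exactly."""
    found = []
    for col in range(len(COLUMNS)):
        constants = sorted({r[col] for r in rows})
        for (name, fn), c in product(OPS.items(), constants):
            kept = [r for r in rows if fn(r[col], c)]
            if sorted(project(kept, proj)) == sorted(expected):
                cols = ", ".join(COLUMNS[i] for i in proj)
                found.append(
                    f"SELECT {cols} FROM t WHERE {COLUMNS[col]} {name} {c!r}"
                )
    return found


def synthesize(rows, expected):
    """Enumerate abstract queries, prune, instantiate, then rank.
    Ranking by query length is an assumption made for this sketch."""
    width = len(expected[0])
    queries = []
    for proj in permutations(range(len(COLUMNS)), width):
        if feasible(rows, proj, expected):
            queries.extend(instantiate(rows, proj, expected))
    return sorted(queries, key=lambda q: (len(q), q))
```

Calling `synthesize(ROWS, [(1,), (3,)])` searches three abstract queries, prunes the two whose projected column cannot produce the expected values, and instantiates only the remaining one. The point of the sketch is the ordering of work: cheap checks on abstract queries come first, and the more expensive enumeration of concrete predicates runs only for candidates that survive.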
Funder
Defense Advanced Research Projects Agency
U.S. Department of Energy
National Science Foundation
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Graphics and Computer-Aided Design, Software
Cited by
27 articles.
1. Gen-T: Table Reclamation in Data Lakes;2024 IEEE 40th International Conference on Data Engineering (ICDE);2024-05-13
2. Synthesis of Bidirectional Programs from Examples with Functional Dependencies;Journal of Information Processing;2024
3. A SQL Synthesis System with Operator Handler;Proceedings of the 2023 7th International Conference on Computer Science and Artificial Intelligence;2023-12-08
4. Relational Query Synthesis ⋈ Decision Tree Learning;Proceedings of the VLDB Endowment;2023-10
5. Searching for explanations of black-box classifiers in the space of semantic queries;Semantic Web;2023-08-02