Affiliation:
1. University of Texas at Austin, USA
Abstract
This paper presents a new technique for automatically synthesizing SQL queries from natural language (NL). At the core of our technique is a new NL-based program synthesis methodology that combines semantic parsing techniques from the NLP community with type-directed program synthesis and automated program repair. Starting with a program sketch obtained using standard parsing techniques, our approach involves an iterative refinement loop that alternates between probabilistic type inhabitation and automated sketch repair. We use the proposed idea to build an end-to-end system called SQLIZER that can synthesize SQL queries from natural language. Our method is fully automated, works for any database without requiring additional customization, and does not require users to know the underlying database schema. We evaluate our approach on over 450 natural language queries concerning three different databases, namely MAS, IMDB, and YELP. Our experiments show that the desired query is ranked within the top 5 candidates in close to 90% of the cases and that SQLIZER outperforms NALIR, a state-of-the-art tool that won a best paper award at VLDB'14.
Funder
Defense Advanced Research Projects Agency
Publisher
Association for Computing Machinery (ACM)
Subject
Safety, Risk, Reliability and Quality,Software
Reference59 articles.
1. From symptom to cause
2. Jonathan Berant Andrew Chou Roy Frostig and Percy Liang. 2013. Semantic Parsing on Freebase from Question-Answer Pairs. In EMNLP. 1533–1544. Jonathan Berant Andrew Chou Roy Frostig and Percy Liang. 2013. Semantic Parsing on Freebase from Question-Answer Pairs. In EMNLP. 1533–1544.
Cited by
138 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献
1. Intelligent Search Engine Tool for Querying Database Systems;International Journal of Mathematical, Engineering and Management Sciences;2024-08-01
2. SEREIA: document store exploration through keywords;Knowledge and Information Systems;2024-06-10
3. A Benchmark to Understand the Role of Knowledge Graphs on Large Language Model's Accuracy for Question Answering on Enterprise SQL Databases;Proceedings of the 7th Joint Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA);2024-06-09
4. CodeS: Towards Building Open-source Language Models for Text-to-SQL;Proceedings of the ACM on Management of Data;2024-05-29
5. Synthesis of Allowlists for Runtime Protection against SQLi;Proceedings of the 2024 ACM/IEEE 44th International Conference on Software Engineering: New Ideas and Emerging Results;2024-04-14