ScienceBenchmark: A Complex Real-World Benchmark for Evaluating Natural Language to SQL Systems

Published:2023-12 Issue:4 Volume:17 Page:685-698
ISSN:2150-8097
Container-title:Proceedings of the VLDB Endowment
language:en
Short-container-title:Proc. VLDB Endow.

Author:

Zhang Yi¹, Deriu Jan¹, Katsogiannis-Meimarakis George², Kosten Catherine¹, Koutrika Georgia², Stockinger Kurt¹

Affiliation:

1. Zurich University of Applied Sciences, Switzerland

2. Athena Research Center, Greece

Abstract

Natural Language to SQL systems (NL-to-SQL) have recently shown improved accuracy (exceeding 80%) for natural language to SQL query translation due to the emergence of transformer-based language models, and the popularity of the Spider benchmark. However, Spider mainly contains simple databases with few tables, columns, and entries, which do not reflect a realistic setting. Moreover, complex real-world databases with domain-specific content have little to no training data available in the form of NL/SQL-pairs, leading to poor performance of existing NL-to-SQL systems. In this paper, we introduce ScienceBenchmark, a new complex NL-to-SQL benchmark for three real-world, highly domain-specific databases. For this new benchmark, SQL experts and domain experts created high-quality NL/SQL-pairs for each domain. To garner more data, we extended the small amount of human-generated data with synthetic data generated using GPT-3. We show that our benchmark is highly challenging, as the top performing systems on Spider achieve a very low performance on our benchmark. Thus, the challenge is many-fold: creating NL-to-SQL systems for highly complex domains with a small amount of hand-made training data augmented with synthetic data. To our knowledge, ScienceBenchmark is the first NL-to-SQL benchmark designed with complex real-world scientific databases, containing challenging training and test data carefully validated by domain experts.

Publisher

Association for Computing Machinery (ACM)

Link

https://dl.acm.org/doi/pdf/10.14778/3636218.3636225

Cited by 1 article.

1. MetaSQL: A Generate-Then-Rank Framework for Natural Language to SQL Translation; 2024 IEEE 40th International Conference on Data Engineering (ICDE); 2024-05-13