Speeding up symbolic reasoning for relational queries

Published: 2018-10-24 Issue: OOPSLA Volume: 2 Page: 1-25
ISSN: 2475-1421
Container-title: Proceedings of the ACM on Programming Languages
Language: en
Short-container-title: Proc. ACM Program. Lang.

Author:

Wang Chenglong¹, Cheung Alvin¹, Bodik Rastislav¹

Affiliation:

1. University of Washington, USA

Abstract

The ability to reason about relational queries plays an important role across many types of database applications, such as test data generation, query equivalence checking, and computer-assisted query authoring. Unfortunately, symbolic reasoning about relational queries can be challenging because relational tables are multisets (bags) of tuples, and the underlying languages, such as SQL, can introduce complex computation among tuples. We propose a space refinement algorithm that soundly reduces the space of tables such applications need to consider. The refinement procedure, independent of the specific dataset application, uses the abstract semantics of the query language to exploit the provenance of tuples in the query output to prune the search space. We implemented the refinement algorithm and evaluated it on SQL using three reasoning tasks: bounded query equivalence checking, test generation for applications that manipulate relational data, and concolic testing of database applications. Using real world benchmarks, we show that our refinement algorithm significantly speeds up (up to 100×) the SQL solver when reasoning about a large class of challenging SQL queries, such as those with aggregations.

Funder

Defense Advanced Research Projects Agency

National Science Foundation

U.S. Department of Energy

Publisher

Association for Computing Machinery (ACM)

Subject

Safety, Risk, Reliability and Quality, Software

Link

https://dl.acm.org/doi/pdf/10.1145/3276527

Reference: 34 articles.

1. The XDa-TA system for automated grading of SQL query assignments

2. Lineage retrieval for scientific data processing: a survey

3. Provenance management in curated databases

4. Data generation for testing and grading SQL queries

Cited by 7 articles.

1. VeriEQL: Bounded Equivalence Verification for Complex SQL Queries with Integrity Constraints;Proceedings of the ACM on Programming Languages;2024-04-29

2. Predicate Pushdown for Data Science Pipelines;Proceedings of the ACM on Management of Data;2023-06-13

3. Verifying Data Constraint Equivalence in FinTech Systems;2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE);2023-05

4. Active Learning for Inference and Regeneration of Applications that Access Databases;ACM Transactions on Programming Languages and Systems;2021-02

5. Provenance-guided synthesis of Datalog programs;Proceedings of the ACM on Programming Languages;2020-01