Affiliation:
1. Università della Svizzera italiana, Lugano, Switzerland
2. Oracle Labs
Abstract
Language-integrated query (LINQ) frameworks offer a convenient programming abstraction for processing in-memory collections of data, allowing developers to concisely express declarative queries using general-purpose programming languages. Existing LINQ frameworks rely on the well-defined type system of statically-typed languages such as C# or Java to perform query compilation and execution. As a consequence of this design, they do not support dynamic languages such as Python, R, or JavaScript. Such languages are however very popular among data scientists, who would certainly benefit from LINQ frameworks in data analytics applications.
In this work we bridge the gap between dynamic languages and LINQ frameworks. We introduce DynQ, a novel query engine designed for dynamic languages. DynQ is language-agnostic, since it is able to execute SQL queries in a polyglot language runtime. Moreover, DynQ can execute queries combining data from multiple sources, namely in-memory object collections as well as on-file data and external database systems. Our evaluation of DynQ shows performance comparable with equivalent hand-optimized code, and in line with common data-processing libraries and embedded databases, making DynQ an appealing query engine for standalone analytics applications and for data-intensive server-side workloads.
Subject
General Earth and Planetary Sciences; Water Science and Technology; Geography, Planning and Development
Cited by
4 articles.
1. DynQ: a dynamic query engine with query-reuse capabilities embedded in a polyglot runtime;The VLDB Journal;2023-03-13
2. SQL to Stream with S2S: An Automatic Benchmark Generator for the Java Stream API;Proceedings of the 21st ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences;2022-11-29
3. Automatic Array Transformation to Columnar Storage at Run Time;Proceedings of the 19th International Conference on Managed Programming Languages and Runtimes;2022-09-14
4. Columnar formats for schemaless LSM-based document stores;Proceedings of the VLDB Endowment;2022-06