SQL/MapReduce

Published:2009-08 Issue:2 Volume:2 Page:1402-1413
ISSN:2150-8097
Container-title:Proceedings of the VLDB Endowment
language:en
Short-container-title:Proc. VLDB Endow.

Author:

Friedman Eric¹, Pawlowski Peter¹, Cieslewicz John¹

Affiliation:

1. Aster Data Systems

Abstract

A user-defined function (UDF) is a powerful database feature that allows users to customize database functionality. Though useful, present UDFs have numerous limitations, including install-time specification of input and output schema and poor ability to parallelize execution. We present a new approach to implementing a UDF, which we call SQL/MapReduce (SQL/MR), that overcomes many of these limitations. We leverage ideas from the MapReduce programming paradigm to provide users with a straightforward API through which they can implement a UDF in the language of their choice. Moreover, our approach allows maximum flexibility as the output schema of the UDF is specified by the function itself at query plan-time. This means that a SQL/MR function is polymorphic. It can process arbitrary input because its behavior as well as output schema are dynamically determined by information available at query plan-time, such as the function's input schema and arbitrary user-provided parameters. This also increases reusability as the same SQL/MR function can be used on inputs with many different schemas or with different user-specified parameters. In this paper we describe the motivation for this new approach to UDFs as well as the implementation within Aster Data Systems' nCluster database. We demonstrate that in the context of massively parallel, shared-nothing database systems, this model of computation facilitates highly scalable computation within the database. We also include examples of new applications that take advantage of this novel UDF framework.
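The plan-time polymorphism described in the abstract can be illustrated with a minimal Python sketch. This is not the SQL/MR API: the class `ProjectColumns`, the method `operate_on_row`, the helper `run`, and the `columns` parameter are all hypothetical names chosen for illustration. The sketch only shows the idea that a function derives its output schema from the input schema and user parameters before any rows are processed, and that rows can then be handled independently per partition.

```python
from typing import Dict, List, Tuple

Schema = List[Tuple[str, str]]  # (column name, type name) pairs
Row = Tuple


class ProjectColumns:
    """Hypothetical polymorphic function that keeps only the requested columns."""

    def __init__(self, input_schema: Schema, params: Dict[str, str]):
        # "Plan time": the output schema is computed here, from the input
        # schema and the user-provided parameters, before any row is seen.
        wanted = [c.strip() for c in params["columns"].split(",")]
        names = [name for name, _ in input_schema]
        missing = [c for c in wanted if c not in names]
        if missing:
            raise ValueError(f"unknown columns: {missing}")
        self._indexes = [names.index(c) for c in wanted]
        self.output_schema: Schema = [input_schema[i] for i in self._indexes]

    def operate_on_row(self, row: Row) -> Row:
        # "Run time": each row is processed independently of the others,
        # so separate partitions could be handled by separate workers.
        return tuple(row[i] for i in self._indexes)


def run(function: ProjectColumns, partitions: List[List[Row]]) -> List[List[Row]]:
    # Stand-in for a shared-nothing executor: one result list per partition.
    return [[function.operate_on_row(r) for r in part] for part in partitions]


if __name__ == "__main__":
    clicks_schema = [("user_id", "int"), ("url", "varchar"), ("ts", "timestamp")]
    f = ProjectColumns(clicks_schema, {"columns": "ts, user_id"})
    print(f.output_schema)
    print(run(f, [[(7, "/home", "2009-08-01")], [(8, "/cart", "2009-08-02")]]))
```

The same class works unchanged on a table with a different schema or with a different `columns` parameter, which is the reusability property the abstract refers to.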

Publisher

VLDB Endowment

Subject

General Earth and Planetary Sciences; Water Science and Technology; Geography, Planning and Development

Link

https://dl.acm.org/doi/pdf/10.14778/1687553.1687567

Cited by 62 articles.

1. On Reasoning About Black-Box UDFs by Classifying their Performance Characteristics;International Conference on Information Systems Development;2024-09-09

2. To UDFs and Beyond: Demonstration of a Fully Decomposed Data Processor for General Data Wrangling Tasks;Proceedings of the VLDB Endowment;2023-08

3. User-Defined Functions in Modern Data Engines;2023 IEEE 39th International Conference on Data Engineering (ICDE);2023-04

4. Data Integration Revitalized: From Data Warehouse Through Data Lake to Data Mesh;Lecture Notes in Computer Science;2023

5. Meta's next-generation realtime monitoring and analytics platform;Proceedings of the VLDB Endowment;2022-08