From natural language processing to neural databases

Published:2021-02 Issue:6 Volume:14 Page:1033-1039
ISSN:2150-8097
Container-title:Proceedings of the VLDB Endowment
language:en
Short-container-title:Proc. VLDB Endow.

Author:

Thorne James¹,Yazdani Majid²,Saeidi Marzieh²,Silvestri Fabrizio²,Riedel Sebastian³,Halevy Alon²

Affiliation:

1. University of Cambridge and Facebook AI

2. Facebook AI

3. Facebook AI and University College London

Abstract

In recent years, neural networks have shown impressive performance gains on long-standing AI problems, such as answering queries from text and machine translation. These advances raise the question of whether neural nets can be used at the core of query processing to derive answers from facts, even when the facts are expressed in natural language. If so, it is conceivable that we could relax the fundamental assumption of database management, namely, that our data is represented as fields of a pre-defined schema. Furthermore, such technology would enable combining information from text, images, and structured data seamlessly. This paper introduces neural databases, a class of systems that use NLP transformers as localized answer derivation engines. We ground the vision in NeuralDB, a system for querying facts represented as short natural language sentences. We demonstrate that recent natural language processing models, specifically transformers, can answer select-project-join queries if they are given a set of relevant facts. However, they cannot scale to non-trivial databases nor answer set-based and aggregation queries. Based on these insights, we identify specific research challenges that are needed to build neural databases. Some of the challenges require drawing upon the rich literature in data management, and others pose new research opportunities to the NLP community. Finally, we show that with preliminary solutions, NeuralDB can already answer queries over thousands of sentences with very high accuracy.

Publisher

VLDB Endowment

Subject

General Earth and Planetary Sciences,Water Science and Technology,Geography, Planning and Development

Link

https://dl.acm.org/doi/pdf/10.14778/3447689.3447706

Cited by 20 articles.

1. Demystifying Data Management for Large Language Models;Companion of the 2024 International Conference on Management of Data;2024-06-09

2. Large Language Models: Principles and Practice;2024 IEEE 40th International Conference on Data Engineering (ICDE);2024-05-13

3. A Multi-Task Learning Framework for Reading Comprehension of Scientific Tabular Data;2024 IEEE 40th International Conference on Data Engineering (ICDE);2024-05-13

4. DTT: An Example-Driven Tabular Transformer for Joinability by Leveraging Large Language Models;Proceedings of the ACM on Management of Data;2024-03-12

5. DB-BERT: making database tuning tools “read” the manual;The VLDB Journal;2023-12-27