Robust Text-to-Cypher Using Combination of BERT, GraphSAGE, and Transformer (CoBGT) Model
-
Published:2024-09-04
Issue:17
Volume:14
Page:7881
-
ISSN:2076-3417
-
Container-title:Applied Sciences
-
language:en
-
Short-container-title:Applied Sciences
Author:
Tran Quoc-Bao-Huy1, Waheed Aagha Abdul1ORCID, Chung Sun-Tae1
Affiliation:
1. Department of Intelligent Systems, Soongsil University, Seoul 06978, Republic of Korea
Abstract
Graph databases have become essential for managing and analyzing complex data relationships, with Neo4j emerging as a leading player in this domain. Neo4j, a high-performance NoSQL graph database, excels in efficiently handling connected data, offering powerful querying capabilities through its Cypher query language. However, due to Cypher’s complexities, making it more accessible for nonexpert users requires translating natural language queries into Cypher. Thus, in this paper, we propose a text-to-Cypher model to effectively translate natural language queries into Cypher. In our proposed model, we combine several methods to enable nonexpert users to interact with graph databases using the English language. Our approach includes three modules: key-value extraction, relation–properties prediction, and Cypher query generation. For key-value extraction and relation–properties prediction, we leverage BERT and GraphSAGE to extract features from natural language. Finally, we use a Transformer model to generate the Cypher query from these features. Additionally, due to the lack of text-to-Cypher datasets, we introduced a new dataset that contains English questions querying information within a graph database, paired with corresponding Cypher query ground truths. This dataset aids future model learning, validation, and comparison on text-to-Cypher task. Through experiments and evaluations, we demonstrate that our model achieves high accuracy and efficiency when comparing with some well-known seq2seq model such as T5 and GPT2, with an 87.1% exact match score on the dataset.
Reference33 articles.
1. Introduction to graph databases;Reasoning Web International Summer School,2014 2. Cao, R., Chen, L., Chen, Z., Zhao, Y., Zhu, S., and Yu, K. (2021, January 1–6). LGESQL: Line Graph Enhanced Text-to-SQL Model with Mixed Local and Non-Local Relations. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Online. 3. Nadime, F., Alastair, G., Paolo, G., Leonid, L., Tobias, L., Victor, M., Stefan, P., Mats, R., Mats, R., and Petra, S. (2018, January 10–15). Cypher: An Evolving Query Language for Property Graphs. Proceedings of the 2018 International Conference on Management of Data, Houston, TX, USA. 4. On the Defense of Spoofing Countermeasures against Adversarial Attacks;Doan;IEEE Access,2023 5. Cisse, M., Adi, Y., Neverova, N., and Keshet, J. (2017, January 4–9). Houdini: Fooling Deep Structured Visual and Speech Recognition Models with Adversarial Examples. Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA.
|
|