VEDAS: an efficient GPU alternative for store and query of large RDF data sets

Published:2021-09-16 Issue:1 Volume:8
ISSN:2196-1115
Container-title:Journal of Big Data
language:en
Short-container-title:J Big Data

Author:

Makpaisit Pisit, Chantrapornchai Chantana

Abstract

Resource Description Framework (RDF) is commonly used as a standard for data interchange on the web. A collection of RDF data sets can form a large graph that is time-consuming to query. Modern Graphics Processing Units (GPUs) can execute parallel programs to speed up the running time. In this paper, we propose a novel RDF data representation, along with a query processing algorithm, that is suitable for GPU processing. Since the main challenges of the GPU architecture are the limited memory size, the memory transfer latency, and the vast number of GPU cores, our system is designed to strengthen the use of GPU cores and reduce the effect of memory transfer. We propose a representation consisting of indices and column-based RDF ID data that can reduce the GPU memory requirement. Indexing and pre-upload filtering techniques are then applied to reduce the data transfer between host and GPU memory. We add an index swapping process to facilitate sorting and joining data on a given variable, and a pre-upload step to reduce the size of the result storage and the data transfer time. The experimental results show that our representation is about 35% smaller than the traditional NT format and 40% smaller than that of gStore. On the WatDiv test suite, query processing achieves speedups ranging from 1.95 to 397.03 over RDF-3X and gStore. On the LUBM benchmark, it achieves speedups of 578.57 and 62.97 over RDF-3X and gStore, respectively. The analysis shows which query cases benefit from our approach.
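As a rough illustration of the general ideas named in the abstract (ID-encoded column data, a sorted index, and a join on a shared variable), here is a minimal CPU-only sketch in Python. It is not the VEDAS implementation: the function names, the sample triples, and the choice of a sort-merge join are all assumptions made for illustration, and nothing here runs on a GPU.

```python
from bisect import bisect_left, bisect_right


def encode(triples):
    """Dictionary-encode string triples into three parallel integer ID columns."""
    ids = {}
    subj, pred, obj = [], [], []
    for s, p, o in triples:
        for term, col in ((s, subj), (p, pred), (o, obj)):
            col.append(ids.setdefault(term, len(ids)))
    return ids, subj, pred, obj


def build_index(key_col):
    """Return row numbers ordered by the key column (a simple sorted index)."""
    return sorted(range(len(key_col)), key=key_col.__getitem__)


def lookup(key_col, index, key):
    """Binary-search the sorted index for every row whose key column equals key."""
    keys = [key_col[i] for i in index]
    return index[bisect_left(keys, key):bisect_right(keys, key)]


def merge_join(left_rows, right_rows, left_col, right_col):
    """Sort-merge join: pair rows where left_col[l] == right_col[r]."""
    left = sorted(left_rows, key=left_col.__getitem__)
    right = sorted(right_rows, key=right_col.__getitem__)
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lv, rv = left_col[left[i]], right_col[right[j]]
        if lv < rv:
            i += 1
        elif lv > rv:
            j += 1
        else:
            # Find the run of right rows sharing this key, then pair it
            # with every left row that has the same key.
            j_end = j
            while j_end < len(right) and right_col[right[j_end]] == lv:
                j_end += 1
            while i < len(left) and left_col[left[i]] == lv:
                out.extend((left[i], r) for r in right[j:j_end])
                i += 1
            j = j_end
    return out


# Hypothetical sample data, not taken from the paper.
triples = [
    ("alice", "knows", "bob"),
    ("bob", "worksAt", "acme"),
    ("carol", "knows", "bob"),
    ("bob", "knows", "dave"),
    ("dave", "worksAt", "initech"),
]
ids, subj, pred, obj = encode(triples)
names = {v: k for k, v in ids.items()}
pred_index = build_index(pred)

# Query: ?x knows ?y . ?y worksAt ?z  (joined on the shared variable ?y)
knows = lookup(pred, pred_index, ids["knows"])
works = lookup(pred, pred_index, ids["worksAt"])
pairs = merge_join(knows, works, obj, subj)
results = [(names[subj[l]], names[obj[l]], names[obj[r]]) for l, r in pairs]
```

Storing triples as integer columns keeps each column compact and contiguous, which is the kind of layout that tends to suit bulk sorting and joining; how VEDAS actually lays out, swaps, and filters its indices is described in the paper itself.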

Funder

Thailand Research Fund

Kasetsart University Research and Development Institute

Publisher

Springer Science and Business Media LLC

Subject

Information Systems and Management,Computer Networks and Communications,Hardware and Architecture,Information Systems

Link

https://link.springer.com/content/pdf/10.1186/s40537-021-00513-y.pdf

References: 44 articles.

1. National Inventory of Natural Heritage: TAXONOMIC REPOSITORY TAXREF. https://inpn.mnhn.fr/programme/referentiel-taxonomique-taxref?lg=en. Accessed 20 Oct 2020.

2. IMATI - CNR: LusTRE: Linked Thesaurus fRamework for Environment. http://purl.oclc.org/net/DumpEarthRDF. Accessed 20 Oct 2020.

3. Gerasimos Razis: Influence Tracker Dataset. https://old.datahub.io/dataset/influence-tracker-dataset. Accessed 20 Oct 2020.

4. Research Group Agile Knowledge Engineering and Semantic Web (AKSW): USPTO patent data. https://old.datahub.io/dataset/linked-uspto-patent-data. Accessed 20 Oct 2020.

5. Wikipedia: DBpedia. https://en.wikipedia.org/wiki/DBpedia. Accessed 20 Oct 2020.

Cited by 2 articles.

1. MP-HTHEDL: A Massively Parallel Hypothesis Evaluation Engine in Description Logic;IEEE Access;2024

2. An efficient and scalable SPARQL query processing framework for big data using MapReduce and hybrid optimum load balancing;Data & Knowledge Engineering;2023-11