Abstract
Various resource description framework (RDF) partitioning methods have been studied for the efficient distributed processing of a large RDF graph. The RDF graph has symmetrical characteristics because subject and object can be used interchangeably if predicate is changed. This paper proposes a dynamic partitioning method of RDF graphs to support load balancing in distributed environments where data insertion and change continue to occur. The proposed method generates clusters and subclusters by considering the usage frequency of the RDF graph that are used by queries as the criteria to perform graph partitioning. It creates a cluster by grouping RDF subgraphs with higher usage frequency while creating a subcluster with lower usage frequency. These clusters and subclusters conduct load balancing by using the mean frequency of queries for the distributed server and conduct graph data partitioning by considering the size of the data stored in each distributed server. It also minimizes the number of edge-cuts connected to clusters and subclusters to minimize communication costs between servers. This solves the problem of data concentration to specific servers due to ongoing data changes and additions and allows efficient load balancing among servers. The performance results show that the proposed method significantly outperforms the existing partitioning methods in terms of query performance time in a distributed server.
Funder
National Research Foundation of Korea
Subject
Physics and Astronomy (miscellaneous),General Mathematics,Chemistry (miscellaneous),Computer Science (miscellaneous)
Cited by
2 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献