Abstract
AbstractBig graphs are part of the movement of “Not Only SQL” databases (also called NoSQL) focusing on the relationships between data, rather than the values themselves. The data is stored in vertices while the edges model the interactions or relationships between these data. They offer flexibility in handling data that is strongly connected to each other. The analysis of a big graph generally involves exploring all of its vertices. Thus, this operation is costly in time and resources because big graphs are generally composed of millions of vertices connected through billions of edges. Consequently, the graph algorithms are expansive compared to the size of the big graph, and are therefore ineffective for data exploration. Thus, partitioning the graph stands out as an efficient and less expensive alternative for exploring a big graph. This technique consists in partitioning the graph into a set of k sub-graphs in order to reduce the complexity of the queries. Nevertheless, it presents many challenges because it is an NP-complete problem. In this article, we present DPHV (Distributed Placement of Hub-Vertices) an efficient parallel and distributed heuristic for large-scale graph partitioning. An application on a real-world graphs demonstrates the feasibility and reliability of our method. The experiments carried on a 10-nodes Spark cluster proved that the proposed methodology achieves significant gain in term of time and outperforms JA-BE-JA, Greedy, DFEP.
Publisher
Springer Science and Business Media LLC
Subject
Information Systems and Management,Computer Networks and Communications,Hardware and Architecture,Information Systems
Reference45 articles.
1. Danai K, Christos F. Individual and collective graph mining: principles, algorithms, and applications. Synth Lect Data Mining Knowl Discov. 2017;9:2.
2. Yoon B, Kim S, Kim S. Use of graph database for the integration of heterogeneous biological data. Genomics Inf. 2017;15(1):19–27.
3. Aridhi S, Nguifo EM. Big graph mining: frameworks and techniques. Big Data Res. 2016;6:1–10.
4. Jiang M, Cui P, Beutel A, Faloutsos C, Yang S. Catching synchronized behaviors in large networks: a graph mining approach. ACM Trans Knowl Discov Data. 2016;10(4):1–27.
5. Alekseev VE, Boliac R, Korobitsyn DV, Lozin VV. NP-hard graph problems and boundary classes of graphs. Theor Comput Sci. 2007;389(1):219–36.
Cited by
9 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献