Graph Stream Compression Scheme Based on Pattern Dictionary Using Provenance
Published:2024-05-25
Issue:11
Volume:14
Page:4553
ISSN:2076-3417
Container-title:Applied Sciences
language:en
Short-container-title:Applied Sciences
Author:
Lee Hyeonbyeong (1), Shin Bokyoung (1), Choi Dojin (2), Lim Jongtae (1), Bok Kyoungsoo (3), Yoo Jaesoo (1)
Affiliation:
1. Department of Information and Communication Engineering, Chungbuk National University, Chung-dae-ro 1, Seowon-gu, Cheongju 28644, Chungcheongbuk-do, Republic of Korea
2. Department of Computer Engineering, Changwon National University, Changwondaehak-ro 20, Uichang-gu, Changwon-si 51140, Gyeongsangnam-do, Republic of Korea
3. Department of Artificial Intelligence Convergence, Wonkwang University, Iksandae 460, Iksan 54538, Jeollabuk-do, Republic of Korea
Abstract
With recent advancements in network technology and the increasing popularity of the internet, the use of social network services and Internet of Things devices has flourished, leading to a continuous generation of large volumes of graph stream data, where changes, such as additions or deletions of vertices and edges, occur over time. Additionally, owing to the need for the efficient use of storage space and security requirements, graph stream data compression has become essential in various applications. Even though various studies on graph compression methods have been conducted, most of them do not fully reflect the dynamic characteristics of graph streams and the complexity of large graphs. In this paper, we propose a compression scheme using provenance data to efficiently process and analyze large graph stream data. It obtains provenance data by analyzing graph stream data and builds a pattern dictionary based on this to perform dictionary-based compression. By improving the existing dictionary-based graph compression methods, it enables more efficient dictionary management through tracking pattern changes and evaluating their importance using provenance. Furthermore, it considers the relationships among sub-patterns using an FP-tree and performs pattern dictionary management that updates pattern scores based on time. Our experiments show that the proposed scheme outperforms existing graph compression methods in key performance metrics, such as compression rate and processing time.
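The abstract describes dictionary-based compression in which pattern scores are updated over time. The following is a minimal Python sketch of that general idea, not the authors' algorithm. Every name in it (`PatternDictionary`, `compress_stream`, `decay`, `capacity`) is hypothetical. Candidate patterns are simply all pairs of edges in a window, which stands in for FP-tree mining, and the stored (score, last update time) pair is only a rough stand-in for provenance data.

```python
from itertools import combinations


class PatternDictionary:
    """Toy pattern dictionary with time-decayed scores (illustrative only)."""

    def __init__(self, decay=0.5, capacity=2):
        self.decay = decay        # assumed per-time-step decay factor
        self.capacity = capacity  # assumed maximum number of stored patterns
        self.entries = {}         # pattern -> (score, time of last update)

    def score_at(self, pattern, t):
        # Score decays exponentially with the time since the last update.
        score, last = self.entries[pattern]
        return score * self.decay ** (t - last)

    def ranked(self, t):
        # Highest decayed score first; ties broken by the pattern's sorted edges.
        return sorted(self.entries, key=lambda p: (-self.score_at(p, t), sorted(p)))

    def observe(self, pattern, t):
        # Decay the old score to time t, then add one for the new occurrence.
        current = self.score_at(pattern, t) if pattern in self.entries else 0.0
        self.entries[pattern] = (current + 1.0, t)

    def evict(self, t):
        # Keep only the `capacity` best-scoring patterns.
        for pattern in self.ranked(t)[self.capacity:]:
            del self.entries[pattern]


def compress_stream(windows, dictionary):
    """Return (matched patterns, leftover raw edges) for each (time, edges) window."""
    output = []
    for t, edges in windows:
        remaining = set(edges)
        matched = []
        # Greedily replace edges covered by a dictionary pattern with a reference.
        for pattern in dictionary.ranked(t):
            if pattern <= remaining:
                matched.append(pattern)
                remaining -= pattern
        output.append((matched, remaining))
        # Candidate patterns: every pair of edges in the window
        # (an assumed simplification in place of FP-tree mining).
        for pair in combinations(sorted(edges), 2):
            dictionary.observe(frozenset(pair), t)
        dictionary.evict(t)
    return output


windows = [
    (0, {("a", "b"), ("b", "c")}),
    (1, {("a", "b"), ("b", "c"), ("c", "d")}),
    (2, {("a", "b"), ("b", "c")}),
]
result = compress_stream(windows, PatternDictionary())
```

In this toy run the first window is stored as raw edges because the dictionary is still empty. Later windows reference the recurring two-edge pattern and store only the edges it does not cover.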
Funder
National Research Foundation of Korea, MSIT, Rural Development Administration