Multi-Source Information Graph Embedding with Ensemble Learning for Link Prediction
Published:2024-07-13
Issue:14
Volume:13
Page:2762
ISSN:2079-9292
Container-title:Electronics
language:en
Short-container-title:Electronics
Author:
Hou Chunning, Wang Xinzhi, Luo Xiangfeng, Xie Shaorong
Affiliation:
1. School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China
Abstract
Link prediction is a key technique for connecting entities and relationships in the field of graph reasoning. It leverages known information in graph-structured data to predict missing facts. Previous studies have focused either on the semantic representation of a single triple or on the graph structure built from triples. The former ignores the associations between different triples, and the latter ignores the meaning of the nodes themselves. Furthermore, common graph-structured datasets are inherently incomplete and suffer from missing information. To address these challenges, we present a novel model called Multi-source Information Graph Embedding with Ensemble Learning for Link Prediction (EMGE), which improves reasoning in link prediction. Ensemble learning is applied systematically throughout model training. At the data level, the approach enhances entity embeddings by integrating structured graph information and unstructured textual data as multi-source inputs, which are fused through an attention mechanism. During the training phase, the principle of ensemble learning is used to extract semantic features from multiple neural network models, facilitating interaction among the enriched information. To ensure effective learning, a novel loss function based on contrastive learning is devised to minimize the discrepancy between predicted values and the ground truth. Moreover, to enhance the semantic representation of graph nodes in link prediction, two rules are introduced during the aggregation of graph structure information. These rules incorporate the concept of spreading activation, enabling a more comprehensive understanding of the relationships between nodes and edges in the graph. During the testing phase, the EMGE model is validated on three datasets: WN18RR, FB15k-237, and a private Chinese financial dataset. The experimental results demonstrate a reduction in the mean rank (MR) by 0.2 times, an improvement in the mean reciprocal rank (MRR) of 5.9%, and an increase in Hit@1 of 12.9% compared to the baseline model.
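The three metrics named in the abstract (MR, MRR, Hit@1) have standard definitions in link prediction. The sketch below is illustrative only: it assumes each test triple has a list of candidate scores where higher means more plausible, and it resolves ties optimistically. The function names `rank_of_true` and `ranking_metrics` are hypothetical and do not come from the paper, whose exact evaluation protocol (for example, filtered versus raw ranking) is not stated here.

```python
from typing import Sequence, Tuple


def rank_of_true(scores: Sequence[float], true_index: int) -> int:
    """Return the 1-based rank of the correct candidate.

    Assumption: a higher score means a more plausible candidate, and ties
    are broken optimistically (only strictly higher scores outrank it).
    """
    true_score = scores[true_index]
    return 1 + sum(1 for s in scores if s > true_score)


def ranking_metrics(ranks: Sequence[int], k: int = 1) -> Tuple[float, float, float]:
    """Compute mean rank (MR), mean reciprocal rank (MRR), and Hit@k."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    n = len(ranks)
    mr = sum(ranks) / n                          # lower is better
    mrr = sum(1.0 / r for r in ranks) / n        # higher is better
    hits_at_k = sum(1 for r in ranks if r <= k) / n  # higher is better
    return mr, mrr, hits_at_k


if __name__ == "__main__":
    # Toy ranks for three hypothetical test triples.
    mr, mrr, hits1 = ranking_metrics([1, 2, 4], k=1)
    print(f"MR={mr:.3f}  MRR={mrr:.3f}  Hit@1={hits1:.3f}")
```

Because MR is an average of raw ranks, a few badly ranked triples can dominate it, whereas MRR and Hit@1 are bounded in (0, 1] and [0, 1] respectively. This is why an improvement is reported as a decrease for MR and as an increase for MRR and Hit@1.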
Funder
National Key Research and Development Program of China; Outstanding Academic Leader Project of Shanghai; National Natural Science Foundation of China