Abstract
Graph edit distance (GED) is a powerful tool for modeling the dissimilarity between graphs. However, computing the exact GED is NP-hard. To tackle this problem, heuristic methods that estimate the GED, such as bipartite and IPFP, were introduced. The stochastic nature of these methods raises a stability issue. In this paper, we propose the first formal study of the stability of GED heuristics, starting by defining a measure of their (in)stability, namely the relative error. We then examine the effects of two critical factors on stability: the number of solutions and the ratio between edit costs. The ratios are computed on five datasets with various properties. We provide general suggestions for choosing these factors properly, which can reduce the relative error by more than an order of magnitude. Finally, we verify the relevance of stability for predicting the performance of GED heuristics, using an edit cost learning algorithm to optimize the performance and k-nearest neighbor regression for prediction. Experiments show that the optimized costs correspond to much higher ratios and to relative errors an order of magnitude lower than those of the expert costs.
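As a rough illustration of how the instability of a stochastic estimator can be quantified, the sketch below computes one possible relative error over repeated runs. The abstract does not state the paper's exact definition, so the formula here (mean relative deviation from the smallest estimate) is an assumption, and `relative_error` and `noisy_upper_bound` are hypothetical names. The toy estimator is only a stand-in for a randomized GED heuristic, not bipartite or IPFP.

```python
import random


def relative_error(estimates):
    """Mean relative deviation of each estimate from the smallest one.

    Assumption: GED heuristics return upper bounds, so the smallest
    estimate is used as the reference. This is an illustrative measure,
    not necessarily the definition used in the paper.
    """
    reference = min(estimates)
    if reference <= 0:
        raise ValueError("reference estimate must be positive")
    return sum((e - reference) / reference for e in estimates) / len(estimates)


def noisy_upper_bound(true_value, n_solutions, rng):
    """Hypothetical stand-in for a randomized heuristic.

    Draws n_solutions random upper bounds on true_value and keeps the
    smallest, mimicking a heuristic that retains its best solution.
    """
    return min(true_value * (1.0 + rng.random()) for _ in range(n_solutions))


rng = random.Random(0)
for n_solutions in (1, 10, 100):
    # Repeat the estimate 50 times and measure how much the runs disagree.
    runs = [noisy_upper_bound(10.0, n_solutions, rng) for _ in range(50)]
    print(n_solutions, relative_error(runs))
```

In this toy setting, keeping the best of more solutions makes repeated runs agree more closely, which is the kind of effect the number of solutions is studied for above.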
Funder
Agence Nationale de la Recherche
China Scholarship Council
Subject
Electrical and Electronic Engineering, Computer Networks and Communications, Hardware and Architecture, Signal Processing, Control and Systems Engineering