Exact Graph Edit Distance Computation Using a Binary Linear Program
This paper presents a binary linear program which computes the exact graph edit distance between two richly attributed graphs (i.e. with attributes on both vertices and edges). Without solving graph edit distance for large graphs, the proposed program ena
- PDF / 196,001 Bytes
- 11 Pages / 439.37 x 666.142 pts Page_size
- 19 Downloads / 243 Views
Normandie Univ, UNIROUEN, UNIHAVRE, INSA Rouen, LITIS, 76000 Rouen, France [email protected] 2 LI Tours, Avenue Jean Portalis, Tours, France
Abstract. This paper presents a binary linear program which computes the exact graph edit distance between two richly attributed graphs (i.e. with attributes on both vertices and edges). Without solving graph edit distance for large graphs, the proposed program enables to process richer and larger graphs than existing approaches based on mathematical programming and the A∗ algorithm. Experiments are led on 7 standard graph datasets and the proposed approach is compared with two stateof-the-art algorithms. Keywords: Graph edit distance
1
· Binary linear program
Introduction
Computing the dissimilarity between two graphs is a crucial issue for graph-based pattern recognition problems, e.g. malware detection [10], chemoinformatics [5], or document analysis [2]. A large number of algorithms are proposed in the literature to compute graph dissimilarity. Among existing approaches, Graph Edit Distance (GED) has retained a lot of attention during the two last decades. Using GED, graph dissimilarity computation is directly linked to a matching process through the introduction of a set of graph edit operations (e.g. vertex insertion, vertex deletion). Each edit operation being characterized by a cost, the GED is the total cost of the least expensive sequence of edit operations that transforms one graph into the other one. A major theoretical advantage of GED is that it is a dissimilarity measure for arbitrarily structured and arbitrarily attributed graphs. Moreover, together with the dissimilarity, it provides the corresponding sequence of edit operations, which can be itself useful for application purpose. A limitation for the use of GED is its computational complexity. Indeed, as stated in [22], the GED problem is NP-hard. This explains why many recent contributions focus on the computation of GED approximations which can be applied on large graphs. Among existing approximations, some are based on the proposition of new heuristics to improve the performance of exact approaches [3,21] whereas others propose faster but suboptimal methods which approximate the c Springer International Publishing AG 2016 A. Robles-Kelly et al. (Eds.): S+SSPR 2016, LNCS 10029, pp. 485–495, 2016. DOI: 10.1007/978-3-319-49055-7 43
486
J. Lerouge et al.
exact GED (e.g. [1,4,14–16,18,20]). These latter are faster, but do not guarantee to find the optimal solution. In this paper, we are interested in the exact computation of the GED between two graphs, for a given set of edit costs. In this context, the main family of existing approaches is based on the widely known A∗ algorithm [3,21]. This algorithm relies on the exploration of the tree of solutions. In this tree, a node corresponds to a partial edition of the input graph towards the target one. A leaf of the tree corresponds to a complete edit path which transforms one graph into the other. The exploration of the tree is led by developing
Data Loading...