Approximation of Graph Edit Distance by Means of a Utility Matrix

Graph edit distance is one of the most popular graph matching paradigms available. By means of a reformulation of graph edit distance to an instance of a linear sum assignment problem, the major drawback of this dissimilarity model, viz. the exponential t

  • PDF / 208,269 Bytes
  • 10 Pages / 439.37 x 666.142 pts Page_size
  • 95 Downloads / 233 Views

DOWNLOAD

REPORT


3

Institute for Information Systems, University of Applied Sciences FHNW, Riggenbachstrasse 16, 4600 Olten, Switzerland [email protected] 2 Department of Informatics, University of Fribourg and HES-SO, 1700 Fribourg, Switzerland [email protected] Institute of Computer Science and Applied Mathematics, University of Bern, Neubr¨ uckstrasse 10, 3012 Bern, Switzerland [email protected]

Abstract. Graph edit distance is one of the most popular graph matching paradigms available. By means of a reformulation of graph edit distance to an instance of a linear sum assignment problem, the major drawback of this dissimilarity model, viz. the exponential time complexity, has been invalidated recently. Yet, the substantial decrease of the computation time is at the expense of an approximation error. The present paper introduces a novel transformation that processes the underlying cost model into a utility model. The benefit of this transformation is that it enables the integration of additional information in the assignment process. We empirically confirm the positive effects of this transformation on three standard graph data sets. That is, we show that the accuracy of a distance based classifier can be improved with the proposed transformation while the run time remains nearly unaffected.

1

Introduction

Graphs are recognized as versatile alternative to feature vectors and thus, they found widespread application in pattern recognition and related fields [1,2]. However, one drawback of graphs, when compared to feature vectors, is the significant increase of the complexity of many algorithms. Regard, for instance, the algorithmic comparison of two patterns (which is actually a basic requirement for pattern recognition). Due to the homogeneous nature of feature vectors, pairwise comparisons is straightforward and can be accomplished in linear time with respect to the length of the two vectors. Yet, the same task for graphs, commonly referred to as graph matching, is much more complex, as one has to identify common parts of the graphs by considering all of their subsets of nodes. Regarding that there are O(2n ) subsets of nodes in a graph with n nodes, the inherent difficulty of graph matching becomes obvious. In the last four decades a huge number of procedures for graph matching have been proposed in the literature [1,2]. They range from spectral methods [3,4], over c Springer International Publishing AG 2016  F. Schwenker et al. (Eds.): ANNPR 2016, LNAI 9896, pp. 185–194, 2016. DOI: 10.1007/978-3-319-46182-3 16

186

K. Riesen et al.

graph kernels [5,6], to reformulations of the discrete graph matching problem to an instance of a continuous optimization problem (basically by relaxing some constraints) [7]. Graph edit distance [8,9], introduced about 30 years ago, is still one of the most flexible graph distance models available and topic of various recent research projects. In order to compute the graph edit distance often A* based search techniques using some heuristics are employed (e.g. [10]). Yet, exact graph edit distanc