Convex graph invariant relaxations for graph edit distance



Series A

Utkan Onur Candogan¹ · Venkat Chandrasekaran²

Received: 17 April 2019 / Accepted: 4 September 2020
© Springer-Verlag GmbH Germany, part of Springer Nature and Mathematical Optimization Society 2020

Abstract The edit distance between two graphs is a widely used measure of similarity that evaluates the smallest number of vertex and edge deletions/insertions required to transform one graph to another. It is NP-hard to compute in general, and a large number of heuristics have been proposed for approximating this quantity. With few exceptions, these methods generally provide upper bounds on the edit distance between two graphs. In this paper, we propose a new family of computationally tractable convex relaxations for obtaining lower bounds on graph edit distance. These relaxations can be tailored to the structural properties of the particular graphs via convex graph invariants. Specific examples that we highlight in this paper include constraints on the graph spectrum as well as (tractable approximations of) the stability number and the maximum-cut values of graphs. We prove under suitable conditions that our relaxations are tight (i.e., exactly compute the graph edit distance) when one of the graphs consists of few eigenvalues. We also validate the utility of our framework on synthetic problems as well as real applications involving molecular structure comparison problems in chemistry.

Keywords Convex optimization · Majorization · Maximum cut · Semidefinite programming · Stability number · Strongly regular graphs

Mathematics Subject Classification 90C25 · 90C22 · 90C90 · 90C35

The authors were supported in part by NSF Grants CCF-1350590 and CCF-1637598, by AFOSR Grant FA9550-16-1-0210, and by a Sloan research fellowship.

Utkan Onur Candogan [email protected]
Venkat Chandrasekaran [email protected]

¹ Department of Electrical Engineering, California Institute of Technology, Pasadena, CA 91125, USA

² Departments of Computing and Mathematical Sciences and of Electrical Engineering, California Institute of Technology, Pasadena, CA 91125, USA


1 Introduction

Graphs are widely used to represent the structure underlying a collection of interacting entities. A common computational question arising in many contexts is that of measuring the similarity between two graphs. For example, the unknown functions of biological structures such as proteins, RNAs and genes are often deduced from structures which have similar sequences with known functions [23,25,31,41,42]. Evaluating graph similarity also plays a central role in various pattern recognition applications [12,35], specifically in areas such as handwriting recognition [17,30], fingerprint classification [24,34] and face recognition [44]. The notion of similarity that is the most commonly considered is the graph edit distance [39]. The edit distance GED(G1, G2) between two graphs G1 and G2 is the smallest number of operations required to transform G1 into