Improved Prototype Embedding Based Generalized Median Computation by Means of Refined Reconstruction Methods

Learning a prototype from a set of given objects is a core problem in machine learning and pattern recognition. A popular approach to consensus learning is to formulate it as an optimization problem in terms of generalized median computation. Recently, a

  • PDF / 331,713 Bytes
  • 11 Pages / 439.37 x 666.142 pts Page_size
  • 97 Downloads / 179 Views

DOWNLOAD

REPORT


bstract. Learning a prototype from a set of given objects is a core problem in machine learning and pattern recognition. A popular approach to consensus learning is to formulate it as an optimization problem in terms of generalized median computation. Recently, a prototypeembedding approach has been proposed to transform the objects into a vector space, compute the geometric median, and then inversely transform back into the original space. This approach has been successfully applied in several domains, where the generalized median problem has inherent high computational complexity (typically N P-hard) and thus approximate solutions are required. In this work we introduce three new methods for the inverse transformation. We show that these methods significantly improve the generalized median computation compared to previous methods.

1

Introduction

Learning a prototype from a set of given objects is a core problem in machine learning and pattern recognition, and has numerous applications [4,6,10]. One often needs a representation of several similar objects by a single consensus object. One example is multiple classifier combination for text recognition, where a change in algorithm parameters or the use of different algorithms can lead to distinct results, each with small errors. Consensus methods produce a text which best represents the different results and thus removes errors and outliers. A popular approach to consensus learning is to formulate it as an optimization problem in terms of generalized median computation. Given a set of objects O = {o1 , . . . , on } in domain O with a distance function δ(oi , oj ), the generalized median can be expressed as o¯ = arg min SOD(o) o∈O

where SOD is the sum of distances SOD(o) =



δ(o, p).

p∈O c Springer International Publishing AG 2016  A. Robles-Kelly et al. (Eds.): S+SSPR 2016, LNCS 10029, pp. 107–117, 2016. DOI: 10.1007/978-3-319-49055-7 10

(1)

108

A. Nienk¨ otter and X. Jiang

In other words, the generalized median is an object which has the smallest sum of distances to all input objects. Note that the median object is not necessarily part of set O. The concept of generalized median has been studied for numerous problem domains related to a broad range of applications, for instance strings [10] and graphs [9]. As the generalized median is not necessary part of the set, any algorithm must be of constructive nature and the construction process crucially depends on the structure of the objects under consideration. In addition, the generalized median computation is provably of high computational complexity in many cases. For instance, the computation of generalized median string turns out to be N P-hard [8] for the string edit distance. The same applies to median ranking under the generalized Kendall-τ distance [3] and ensemble clustering for reasonable clustering distance functions, e.g. the Mirkin-metric [11]. Given the high computational complexity, approximate solutions are required to calculate the generalized median in reasonable time. Recently, a prototype e