Learning Diverse Models: The Coulomb Structured Support Vector Machine
Martin Schiegg, Ferran Diego, and Fred Hamprecht
1 University of Heidelberg, IWR/HCI, 69120 Heidelberg, Germany
{martin.schiegg,ferran.diego,fred.hamprecht}@iwr.uni-heidelberg.de
2 Robert Bosch GmbH, 70465 Stuttgart, Germany
Abstract. In structured prediction, it is standard procedure to discriminatively train a single model that is then used to make a single prediction for each input. This practice is simple but risky in many ways. For instance, models are often designed with tractability rather than faithfulness in mind. To hedge against such model misspecification, it may be useful to train multiple models that are all a reasonable fit to the training data, at least one of which will hopefully make more valid predictions than the single model of the standard procedure. We propose the Coulomb Structured SVM (CSSVM) as a means to obtain, at training time, a full ensemble of different models. At test time, these models can run in parallel and independently to make diverse predictions. We demonstrate on challenging tasks from computer vision that some of these diverse predictions have significantly lower task loss than that of a single model, and improve over state-of-the-art diversity-encouraging approaches.

Keywords: Structured output learning · Diverse predictions · Multiple output learning · Structured support vector machine
1 Introduction
The success of large margin methods for structured output learning, such as the structured support vector machine (SSVM) [1], is partly due to their good generalization performance on test data compared to, e.g., maximum likelihood learning on structured models [2]. Despite such regularization strategies, however, it is not guaranteed that the model which optimizes the learning objective actually generalizes well to unseen data. Reasons include wrong model assumptions, noisy data, ambiguities in the data, missing features, insufficient training data, or a task loss which is too complex to model directly. To further decrease the generalization error, it is beneficial either to (i) generate multiple likely solutions from the model [3–5] or (ii) learn multiple models which generate diverse predictions [6–8]. The different predictions for a given structured input may then be analyzed to compute robustness/uncertainty measures, or may be the input for a more complex model exploiting higher-order information.
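For reference, the margin-rescaled SSVM of [1] mentioned above trains a single weight vector w by requiring, for every training pair (x_n, y_n), a margin over all other outputs y that scales with the task loss Δ. This is the standard formulation (restated here for context, with ψ denoting the joint feature map):

\[
\min_{w,\;\xi \geq 0} \;\; \frac{1}{2}\lVert w \rVert^2 \;+\; \frac{C}{N}\sum_{n=1}^{N} \xi_n
\qquad \text{s.t.} \quad
\langle w, \psi(x_n, y_n) \rangle - \langle w, \psi(x_n, y) \rangle \;\geq\; \Delta(y_n, y) - \xi_n
\quad \forall n,\; \forall y \neq y_n .
\]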
Fig. 1. Structured SVM learning. “+” indicates a structured training example, whereas “−” in the same color are the corresponding structured outputs with task loss Δ(+, −) > 0. (a) A standard linear SSVM maximizes the margin between positive and all “negative” examples (decision boundary with its normal vector in cyan). (b) Multiple choice learning [6] learns M SSVMs (here: 3) which cluster the space (clusters for positive and negative examples are depicted in the same color) to generate M outputs. (c) The proposed Coulomb Structured SVM learns an ensemble of M diverse models, each of which contributes its own prediction.
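To make the evaluation idea from the abstract concrete, the following minimal Python sketch (not from the paper; the model objects with a `predict` method and the Hamming task loss are assumptions made for illustration) runs M independently trained structured predictors on the same input and reports the prediction with the lowest task loss, i.e. the "oracle" pick commonly reported for diversity-encouraging methods.

```python
import numpy as np

def hamming_loss(y_true, y_pred):
    """Illustrative task loss Delta: fraction of output variables that disagree."""
    return float(np.mean(np.asarray(y_true) != np.asarray(y_pred)))

def oracle_best(models, x, y_true, loss=hamming_loss):
    """Run M independently trained structured models on input x (possible in
    parallel, since they do not interact at test time) and return the prediction
    with the lowest task loss, i.e. the oracle accuracy of the ensemble."""
    predictions = [m.predict(x) for m in models]          # M diverse outputs
    losses = [loss(y_true, y_hat) for y_hat in predictions]
    best = int(np.argmin(losses))
    return predictions[best], losses[best]
```

In this hypothetical interface, the diversity between the M models is established entirely at training time; at test time each model simply makes its own independent prediction.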