Partial Linearization Based Optimization for Multi-class SVM
IIIT-Hyderabad, Hyderabad, India ([email protected])
CentraleSupelec and Inria Saclay, Palaiseau, France
University of Oxford, Oxford, UK
Abstract. We propose a novel partial linearization based approach for optimizing the multi-class svm learning problem. Our method is an intuitive generalization of the Frank-Wolfe and the exponentiated gradient algorithms. In particular, it allows us to combine several of their desirable qualities into one approach: (i) the use of an expectation oracle (which provides the marginals over each output class) in order to estimate an informative descent direction, similar to exponentiated gradient; (ii) analytical computation of the optimal step-size in the descent direction that guarantees an increase in the dual objective, similar to Frank-Wolfe; and (iii) a block coordinate formulation similar to the one proposed for Frank-Wolfe, which allows us to solve large-scale problems. Using the challenging computer vision problems of action classification, object recognition and gesture recognition, we demonstrate the efficacy of our approach on training multi-class svms with standard, publicly available, machine learning datasets.

Keywords: Multi-class svm, Partial linearization, Optimization
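Properties (ii) and (iii) refer to the block-coordinate Frank-Wolfe (BCFW) algorithm of Lacoste-Julien et al. As a point of reference, a minimal sketch of that baseline for the multi-class svm dual is given below; the function name, variable names and the toy use of the 0-1 task loss are our own illustrative choices, not the paper's code, and the proposed method would replace the max oracle below with an expectation oracle that returns class marginals:

```python
import numpy as np

def bcfw_multiclass_svm(X, y, n_classes, lam=0.01, n_passes=20, seed=0):
    """Simplified block-coordinate Frank-Wolfe for the multi-class svm
    dual (after Lacoste-Julien et al., 2013).  Each iteration updates the
    dual block of a single training example and takes the analytic
    optimal step size, so the dual objective never decreases."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = np.zeros((n_classes, d))      # primal weights, one row per class
    Wi = np.zeros((n, n_classes, d))  # per-example blocks; W == Wi.sum(0)
    li = np.zeros(n)                  # per-example linear dual terms
    for _ in range(n_passes):
        for i in rng.permutation(n):
            # Max oracle: loss-augmented decoding with the 0-1 loss.
            scores = W @ X[i]
            margin = scores - scores[y[i]] + 1.0
            margin[y[i]] = 0.0
            ystar = int(np.argmax(margin))
            # Frank-Wolfe corner for block i: w_s = psi_i(y*) / (lam * n).
            Ws = np.zeros((n_classes, d))
            if ystar != y[i]:
                Ws[y[i]] = X[i] / (lam * n)
                Ws[ystar] = -X[i] / (lam * n)
            ls = (0.0 if ystar == y[i] else 1.0) / n  # task loss term
            # Analytic optimal step size, clipped to [0, 1].
            diff = Wi[i] - Ws
            denom = lam * np.sum(diff * diff)
            if denom > 0:
                gamma = (lam * np.sum(diff * W) - li[i] + ls) / denom
                gamma = min(max(gamma, 0.0), 1.0)
            else:
                gamma = 0.0
            # Block update; W tracks the sum of the blocks.
            Wnew = (1.0 - gamma) * Wi[i] + gamma * Ws
            W += Wnew - Wi[i]
            Wi[i] = Wnew
            li[i] = (1.0 - gamma) * li[i] + gamma * ls
    return W
```

Because each update touches only one example's dual block, the method scales to large training sets; this is the block coordinate structure that property (iii) carries over to the partial linearization approach.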
1 Introduction
Electronic supplementary material: The online version of this chapter (doi:10.1007/978-3-319-46454-1_51) contains supplementary material, which is available to authorized users.

© Springer International Publishing AG 2016. B. Leibe et al. (Eds.): ECCV 2016, Part V, LNCS 9909, pp. 842–857, 2016. DOI: 10.1007/978-3-319-46454-1_51

Many tasks in computer vision can be formulated as multi-class classification problems. In other words, given an image or a video, the task is to assign it a label that belongs to a specified finite set. For example, in the case of object recognition from an image, the label can be car, chair or person. Similarly, for action recognition from a video, action categories such as jumping, kicking or clapping can be candidate labels. There has been extensive research in the area of multi-class classification, with a plethora of solutions being proposed [2,4,16,18]. In this work, we focus on the multi-class svm (mc-svm), which is one of the most popular methods for this task. The mc-svm model provides a linear function that
gives a score for each class. Given a test sample, its class is predicted by maximizing the score. During learning, the mc-svm objective minimizes an upper bound on the empirical risk of the training data, for which we know the ground-truth labels. The risk is typically measured by the standard 0-1 loss function. However, any other loss function can be easily substituted into the mc-svm learning framework. The size of the mc-svm learning problem rapidly increases with the number of classes and the size of the training dataset. In order to enable the use of mc-svm with large-scale problems, several optimization algorithms for minimizing its learning objective have been proposed.
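As a concrete illustration of the prediction rule and the surrogate objective just described, the following sketch implements linear per-class scoring and the standard Crammer-Singer multi-class hinge loss, which upper-bounds the 0-1 loss (function and variable names are our own):

```python
import numpy as np

def predict(W, x):
    """Predict the class whose row of W gives the highest linear score."""
    return int(np.argmax(W @ x))

def multiclass_hinge(W, x, y):
    """Crammer-Singer multi-class hinge loss for one example (x, y):
    max_j (score_j - score_y + [j != y]).  Since the y-th margin term is
    zero, the maximum is always >= 0, and it is >= 1 whenever the
    prediction is wrong, so this upper-bounds the 0-1 loss."""
    scores = W @ x
    margins = scores - scores[y] + 1.0
    margins[y] = 0.0  # no margin requirement against the true class
    return float(np.max(margins))
```

For example, with `W = [[1, 0], [0, 1], [-1, -1]]` and `x = [2, 1]`, class 0 is predicted with zero loss, while labeling the same point as class 1 incurs a hinge loss of 2.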