Ensemble Approaches of Support Vector Machines for Multiclass Classification
Abstract. Support vector machine (SVM), which was originally designed for binary classification, has achieved superior performance in various classification problems. One popular approach to extending it to multiclass classification is to treat the problem as a collection of binary classification problems. Majority voting or winner-takes-all is then applied to combine the binary outputs, but this often raises difficulties such as breaking ties and tuning the weights of the individual classifiers. This paper presents two novel ensemble approaches: probabilistic ordering of one-vs-rest (OVR) SVMs with a naïve Bayes classifier, and multiple decision templates of OVR SVMs. Experiments with multiclass datasets have shown the usefulness of the ensemble methods.

Keywords: Support vector machines; Ensemble; Naïve Bayes; Multiple decision templates; Cancer classification; Fingerprint classification.
1 Introduction

Support Vector Machine (SVM) is a relatively new learning method that shows excellent performance in many pattern recognition applications [1]. It maps an input sample into a high-dimensional feature space through a non-linear transformation function and tries to find an optimal hyperplane that minimizes the recognition error on the training data [2]. Since the SVM was originally designed for binary classification, a multiclass SVM method must be devised [3]. Basically, there are two different trends for extending SVMs to multiclass problems. The first considers the multiclass problem directly as a generalization of the binary classification scheme [4]; this approach often leads to a complex optimization problem. The second decomposes a multiclass problem into multiple binary classification problems, each of which can be solved by an SVM [5]. One-vs-rest (OVR) is a representative decomposition strategy, while winner-takes-all and error-correcting codes (ECCs) are reconstruction schemes for combining the multiple outputs [6]. It has been pointed out that there is no guarantee that the decomposition-based approach reaches the optimal separation of samples [7]. There are several reasons for this, such as unbalanced sample sizes. However, these shortcomings can be mitigated by the proper selection of a model or a decision scheme.

A. Ghosh, R.K. De, and S.K. Pal (Eds.): PReMI 2007, LNCS 4815, pp. 1–10, 2007. © Springer-Verlag Berlin Heidelberg 2007
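The two reconstruction schemes mentioned above can be illustrated with a minimal sketch. The function names below are illustrative, not from the paper: `winner_takes_all` assumes each OVR classifier emits a signed margin and picks the class with the largest one, while `majority_vote` counts pairwise (one-vs-one) wins and shows how ties can arise, one of the problems the text points out.

```python
import numpy as np

def winner_takes_all(margins):
    """Combine one-vs-rest outputs: each entry is the signed margin of
    one class's binary SVM; the class with the largest margin wins."""
    return int(np.argmax(margins))

def majority_vote(pairwise_winners, n_classes):
    """Combine one-vs-one outputs: each entry is the winning class of
    one pairwise classifier. Returns the top class and the vote counts;
    equal counts (a tie) must be broken by some extra rule."""
    votes = np.zeros(n_classes, dtype=int)
    for c in pairwise_winners:
        votes[c] += 1
    return int(np.argmax(votes)), votes

# Example: three classes, OVR margins favour class 1.
print(winner_takes_all([-0.3, 1.2, 0.4]))        # class 1
# One-vs-one: classifiers (0 vs 1), (0 vs 2), (1 vs 2) predicted 1, 0, 1.
print(majority_vote([1, 0, 1], n_classes=3))      # class 1 with 2 votes
```

Note that `np.argmax` silently returns the first maximal index, so a genuine tie would be resolved arbitrarily; this is exactly the tie-break problem the ensemble methods of this paper are designed to avoid.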
J.-K. Min, J.-H. Hong, and S.-B. Cho
In this paper, we present two novel ensemble approaches for applying OVR SVMs to multiclass classification. The first orders OVR SVMs probabilistically by using a naïve Bayes (NB) classifier with respect to the subsumption architecture [8]. The second uses multiple decision templates (MuDTs) of OVR SVMs [9]. It organizes the outputs of the SVMs into a decision profile as a matrix, and estimates localized templates from the profiles of the training set by using the K-means clustering algorithm. The profile of a test sample is then matched to the templates by a similarity measure. The approaches have been validated on the GCM cancer data [10] and the NI
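The MuDTs procedure described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's exact algorithm: each sample's decision profile is taken as the vector of OVR SVM outputs, K-means clusters the training profiles of each class into a few localized templates, and a test profile is assigned to the class of its nearest template by Euclidean distance (the paper leaves the similarity measure open). All function names here are illustrative.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain K-means: returns k cluster centers of the rows of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign each profile to its nearest center, then recompute centers.
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers

def build_templates(profiles, y, n_classes, k=2):
    """Cluster each class's training decision profiles into up to k
    localized templates (one template set per class)."""
    return {c: kmeans(profiles[y == c], min(k, int(np.sum(y == c))))
            for c in range(n_classes)}

def classify(profile, templates):
    """Match a test profile to the nearest template; return its class."""
    best_class, best_dist = None, np.inf
    for c, temps in templates.items():
        d = np.min(((temps - profile) ** 2).sum(-1))
        if d < best_dist:
            best_class, best_dist = c, d
    return best_class

# Toy example: 3-class problem, profiles = OVR SVM output vectors.
profiles = np.array([[0.9, 0.1, 0.0], [1.0, 0.0, 0.1],
                     [0.1, 0.9, 0.0], [0.0, 1.1, 0.1],
                     [0.0, 0.1, 1.0], [0.1, 0.0, 0.9]])
y = np.array([0, 0, 1, 1, 2, 2])
templates = build_templates(profiles, y, n_classes=3, k=1)
print(classify(np.array([0.95, 0.05, 0.0]), templates))  # class 0
```

Clustering per class (rather than averaging all of a class's profiles into a single template) is what makes the templates "multiple" and localized: a class whose SVM outputs form several distinct patterns keeps one template per pattern.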