Improved Sparsity of Support Vector Machine with Robustness Towards Label Noise Based on Rescaled α-Hinge Loss with Non-smooth Regularizer

Manisha Singla¹ · Debdas Ghosh² · K. K. Shukla¹

Accepted: 1 September 2020 © Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract As support vector machines (SVMs) are used extensively in machine learning applications, it is essential to obtain a sparse model that is also robust to noise in the data set. Although many approaches to building a robust SVM have been proposed, the robust SVM based on the rescaled hinge loss function (RSVM-RHHQ) has attracted a great deal of attention: applying correntropy to the hinge loss function adds a noticeable amount of robustness to the model. However, the sparsity of the model can be improved further. In this work, we focus on enhancing the sparsity of RSVM-RHHQ. As this work improves on RSVM-RHHQ, we follow its protocol of adding label noise to the data, but with an altogether new problem formulation. We apply correntropy to the α-hinge loss function, which results in a better loss function than the rescaled hinge loss. We pair this non-convex, non-smooth loss with a non-smooth regularizer and solve the resulting problem using the primal–dual proximal method. We find that this combination not only adds sparsity to the model but also outperforms existing robust SVM methods in robustness towards label noise. We also provide a convergence proof for the proposed approach and include the time complexity of the optimization technique. We perform experiments on various publicly available real-world data sets to compare the proposed method with existing robust SVM methods; for this purpose, we use small data sets, large data sets, and data sets with significant class imbalance. Experimental results show that the proposed approach outperforms existing methods in sparseness, accuracy, and robustness. We also provide a sensitivity analysis of the regularization parameter under label noise in the data set.

Keywords Support vector machine · Robust statistics · Rescaled α-hinge loss · Non-smooth regularizer · Primal–dual proximal method · Sparsity
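For orientation, the correntropy-induced rescaling that underlies this family of losses can be sketched as follows; this is the standard form from the rescaled hinge loss literature, applied here to a generic base loss ℓ, and the exact rescaled α-hinge loss is as defined in the body of the paper.

% Correntropy-induced rescaling of a base loss \ell (a sketch; the
% constants follow the rescaled hinge loss literature and are not
% necessarily this paper's exact \alpha-hinge variant).
% Base hinge loss on a margin value z = y (w^\top x + b):
%   \ell_{\mathrm{hinge}}(z) = \max(0,\, 1 - z).
\[
  \ell_{\mathrm{rescaled}}(z)
  = \beta \left[ 1 - \exp\bigl( -\eta\, \ell(z) \bigr) \right],
  \qquad
  \beta = \frac{1}{1 - e^{-\eta}}, \quad \eta > 0,
\]
% where \beta normalizes the loss so that \ell_{\mathrm{rescaled}}(z) = 1
% whenever \ell(z) = 1. Because the exponential saturates, the penalty on
% grossly misclassified (e.g., mislabeled) points is bounded, which is the
% source of the robustness to label noise.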

Extended author information available on the last page of the article


1 Introduction

The support vector machine (SVM) is a popular supervised learning model, first introduced as a maximum margin classifier [38] and described in detail in [39]. SVMs can be used for both classification and regression problems. For classification, the SVM finds a hyperplane in the feature space that maximally separates the classes when the training samples are separable; this is formulated as minimizing the norm of the weight vector corresponding to the hyperplane. When the training samples are not separable, a penalty term is added that approximates the total loss. SVMs have wide-ranging applications in the literature.
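For reference, the soft-margin formulation described above can be written as the following minimal sketch; the notation is generic and not necessarily that used later in the paper.

% Soft-margin SVM primal with hinge-loss penalty (a standard reference
% formulation; the notation is generic, not necessarily this paper's).
% Training pairs (x_i, y_i), i = 1, ..., n, with labels y_i \in \{-1, +1\}:
\[
  \min_{w,\, b}\;
  \frac{1}{2}\, \|w\|^2
  + C \sum_{i=1}^{n} \max\bigl( 0,\; 1 - y_i \, ( w^\top x_i + b ) \bigr),
\]
% where minimizing \|w\| maximizes the margin of the separating hyperplane
% and the hinge-loss sum is the penalty term that approximates the total
% loss on non-separable samples; the parameter C > 0 trades the two off.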