Support vector regression for polyhedral and missing data
Gianluca Gazzola¹,² · Myong K. Jeong¹,³
Accepted: 10 September 2020 © Springer Science+Business Media, LLC, part of Springer Nature 2020
Abstract

We introduce "Polyhedral Support Vector Regression" (PSVR), a regression model for data represented by arbitrary convex polyhedral sets. PSVR is derived as a generalization of support vector regression, in which the data is represented by individual points along input variables X_1, X_2, ..., X_p and output variable Y, and extends a support vector classification model previously introduced for polyhedral data. PSVR is in essence a robust-optimization model, which defines prediction error as the largest deviation, calculated along Y, between an interpolating hyperplane and all points within a convex polyhedron; the model relies on the affine Farkas' lemma to make this definition computationally tractable within the formulation. As an application, we consider the problem of regression with missing data, where we use convex polyhedra to model the multivariate uncertainty involving the unobserved values in a data set. For this purpose, we discuss a novel technique that builds on multiple imputation and principal component analysis to estimate convex polyhedra from missing data, and on a geometric characterization of such polyhedra to define observation-specific hyper-parameters in the PSVR model. We show that an appropriate calibration of such hyper-parameters can have a significantly beneficial impact on the model's performance. Experiments on both synthetic and real-world data illustrate how PSVR performs competitively with or better than other benchmark methods, especially on data sets with a high degree of missingness.

Keywords Regression · Uncertainty · Missing data · Convex polyhedron · Farkas' lemma
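The robust error definition in the abstract — the largest deviation along Y between a fixed hyperplane and all points of a convex polyhedron — can be evaluated by solving a pair of linear programs. The sketch below is only an illustration of that definition, not the paper's formulation (which uses the affine Farkas' lemma to fold this maximization into the training problem itself); the function name and interface are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def worst_case_deviation(w, b, A, c):
    """Largest |y - (w.x + b)| over the polyhedron {z = (x, y) : A z <= c},
    where y is the last coordinate of z. Solves one LP per sign of the
    deviation and returns the larger optimum."""
    p = len(w)
    # Deviation along Y as a linear function of z: y - (w.x + b) = obj.z - b
    obj = np.append(-np.asarray(w, dtype=float), 1.0)
    best = 0.0
    for sign in (1.0, -1.0):
        # Maximize sign*(obj.z - b)  <=>  minimize -sign*obj.z
        res = linprog(-sign * obj, A_ub=A, b_ub=c,
                      bounds=[(None, None)] * (p + 1))
        if res.success:
            best = max(best, sign * (obj @ res.x) - sign * b)
    return best
```

For instance, for the hyperplane y = x and the box 0 <= x <= 1, 0 <= y <= 2, the largest deviation is 2, attained at the vertex (0, 2).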
Correspondence: Myong K. Jeong, [email protected]; Gianluca Gazzola, [email protected]

1 Rutgers Center for Operations Research, Department of Management Science and Information Systems, Rutgers University, 100 Rockafeller Road, Piscataway, NJ 08854, USA
2 Bridge Intelligence LLC, 1215 Livingston Ave Suite 208, North Brunswick, NJ 08902, USA
3 Department of Industrial and Systems Engineering, Rutgers University, 96 Frelinghuysen Road, Piscataway, NJ 08854, USA

Annals of Operations Research

1 Introduction

Support vector regression (SVR) is a supervised learning method for the estimation of an unknown function from a data set of observations, each of which is represented by a point with
multiple input values and one output value (Hastie et al. 2009). Such estimation is carried out via a hyperplane, which is optimally fit to the data set, possibly after a non-linear transformation of the input values by means of a kernel function (Vapnik 1995; Drucker et al. 1997; Smola and Schölkopf 2004). Its remarkable performance as a predictive model has earned SVR considerable popularity in a variety of fields of application, including finance (Yang et al. 2002), transportation (Wu et al. 2004), genetics (Myasnikova et al.
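As a point of reference for the point-based setting just described, standard epsilon-insensitive SVR can be fit in a few lines with scikit-learn. This is an illustrative sketch on toy data, not part of the paper; the library call and parameter values are ours.

```python
import numpy as np
from sklearn.svm import SVR

# Toy data: a noisy linear relationship between one input and the output
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(200, 1))
y = 2.0 * X.ravel() + 1.0 + rng.normal(scale=0.1, size=200)

# epsilon-insensitive SVR; a non-linear kernel (e.g. "rbf") would be used
# in place of "linear" to fit non-linear data
model = SVR(kernel="linear", C=10.0, epsilon=0.1).fit(X, y)
pred = model.predict([[1.0]])  # close to the true value 2*1 + 1 = 3
```

The `epsilon` parameter sets the width of the insensitive tube around the fitted hyperplane within which deviations incur no loss, while `C` trades flatness against tolerance of deviations beyond the tube.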