ORIGINAL ARTICLE
Gradient descent algorithms for quantile regression with smooth approximation

Songfeng Zheng
Received: 22 April 2011 / Accepted: 23 June 2011 / Published online: 22 July 2011
© Springer-Verlag 2011
Abstract  Gradient-based optimization methods often converge quickly to a local optimum. However, the check loss function used by the quantile regression model is not everywhere differentiable, which prevents gradient-based optimization methods from being applied directly. This paper therefore introduces a smooth function to approximate the check loss function so that gradient-based optimization methods can be employed for fitting the quantile regression model. The properties of the smooth approximation are discussed. Two algorithms are proposed for minimizing the smoothed objective function. The first method applies gradient descent directly, resulting in the gradient descent smooth quantile regression model; the second minimizes the smoothed objective function in the framework of functional gradient descent, changing the fitted model along the negative gradient direction in each iteration, which yields the boosted smooth quantile regression algorithm. Extensive experiments on simulated data and real-world data show that, compared to alternative quantile regression models, the proposed smooth quantile regression algorithms achieve higher prediction accuracy and are more efficient at removing noninformative predictors.

Keywords  Quantile regression · Gradient descent · Boosting · Variable selection

S. Zheng (✉)
Department of Mathematics, Missouri State University, Springfield, MO 65897, USA
e-mail: [email protected]

1 Introduction

The ordinary least squares regression aims to estimate the conditional expectation of the response \(Y\) given the
predictor (vector) \(\mathbf{x}\), i.e., \(E(Y \mid \mathbf{x})\). However, the mean value (or the conditional expectation) is sensitive to outliers in the data [14]. Therefore, if the data are not homogeneously distributed, we expect least squares regression to give poor predictions. The \(\tau\)th quantile of a distribution is defined as the value such that \(100\tau\%\) of the mass lies to its left. Compared to the mean, quantiles are more robust to outliers [14]. Another advantage of quantiles is that a series of quantile values can describe the whole data distribution better than a single value (e.g., the mean) does. Let \(Q_\tau(Y)\) be the \(\tau\)th quantile of a random variable \(Y\); it can be proved [12] that
\[
Q_\tau(Y) = \arg\min_{c} E_Y\left[\rho_\tau(Y - c)\right],
\]
where \(\rho_\tau(r)\) is the "check function" [14] defined by
\[
\rho_\tau(r) = \tau\, r\, I(r \ge 0) - (1 - \tau)\, r\, I(r < 0). \tag{1}
\]
The function \(I(\cdot)\) in Eq. (1) is the indicator function, with \(I(\cdot) = 1\) if the condition is true and \(I(\cdot) = 0\) otherwise. Given data \(\{(\mathbf{x}_i, Y_i),\ i = 1, \ldots, n\}\), with predictor vector \(\mathbf{x}_i \in \mathbb{R}^p\) and response \(Y_i \in \mathbb{R}\), let \(q(\mathbf{x})\) be the \(\tau\)th conditional quantile of \(Y\) given \(\mathbf{x}\). Similar to least squares regression, quantile regression (QReg) [14] aims at estimating the conditional \(\tau\)th quantile of the response given the predictor vector \(\mathbf{x}\), and can be formulated as minimizing the empirical check loss \(\sum_{i=1}^{n} \rho_\tau\bigl(Y_i - f(\mathbf{x}_i)\bigr)\) over candidate functions \(f\).
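To make the definitions above concrete, the following minimal Python sketch evaluates the check loss of Eq. (1), verifies numerically that the constant minimizing the empirical check loss is (approximately) the sample \(\tau\)th quantile, and illustrates why smoothing matters: once a differentiable surrogate replaces \(\rho_\tau\), plain gradient descent applies. The logistic-type surrogate \(S_\alpha(r) = \tau r + \alpha \log(1 + e^{-r/\alpha})\), the smoothing parameter \(\alpha\), and all function names here are illustrative assumptions, not necessarily the exact construction adopted in the paper.

```python
import numpy as np

def check_loss(r, tau):
    # Check loss of Eq. (1): rho_tau(r) = tau*r*I(r >= 0) - (1 - tau)*r*I(r < 0)
    return np.where(r >= 0, tau * r, (tau - 1.0) * r)

def smooth_check_loss(r, tau, alpha=0.1):
    # Illustrative smooth surrogate (assumption): S_alpha(r) = tau*r + alpha*log(1 + exp(-r/alpha));
    # it is differentiable everywhere and tends to rho_tau(r) as alpha -> 0.
    return tau * r + alpha * np.logaddexp(0.0, -r / alpha)

rng = np.random.default_rng(0)
y = rng.standard_normal(5000)
tau, alpha = 0.75, 0.1

# (a) The constant c minimizing the empirical check loss is the sample tau-th quantile.
grid = np.linspace(y.min(), y.max(), 2001)
c_star = grid[np.argmin([check_loss(y - c, tau).mean() for c in grid])]
print(c_star, np.quantile(y, tau))   # the two values nearly coincide

# (b) The smooth surrogate admits plain gradient descent on c.
c, lr = 0.0, 0.5
for _ in range(500):
    # d/dc of the mean smooth loss: mean of 1/(1 + exp((y - c)/alpha)) - tau
    grad = np.mean(1.0 / (1.0 + np.exp((y - c) / alpha)) - tau)
    c -= lr * grad
print(c)   # close to np.quantile(y, tau) for small alpha
```

The same idea carries over from a single constant \(c\) to a parametrized or boosted model \(f(\mathbf{x})\): once the loss is smooth, its gradient with respect to the model parameters (or its functional gradient) is defined everywhere, which is what the two algorithms proposed in this paper exploit.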