Gradient descent algorithms for quantile regression with smooth approximation


ORIGINAL ARTICLE

Songfeng Zheng

Received: 22 April 2011 / Accepted: 23 June 2011 / Published online: 22 July 2011
© Springer-Verlag 2011

Abstract Gradient based optimization methods often converge quickly to a local optimum. However, the check loss function used by the quantile regression model is not everywhere differentiable, which prevents gradient based optimization methods from being applicable. As such, this paper introduces a smooth function to approximate the check loss function so that gradient based optimization methods can be employed for fitting the quantile regression model. The properties of the smooth approximation are discussed. Two algorithms are proposed for minimizing the smoothed objective function. The first method directly applies gradient descent, resulting in the gradient descent smooth quantile regression model; the second approach minimizes the smoothed objective function in the framework of functional gradient descent by changing the fitted model along the negative gradient direction in each iteration, which yields the boosted smooth quantile regression algorithm. Extensive experiments on simulated data and real-world data show that, compared to alternative quantile regression models, the proposed smooth quantile regression algorithms can achieve higher prediction accuracy and are more efficient in removing noninformative predictors.

Keywords Quantile regression · Gradient descent · Boosting · Variable selection

S. Zheng (✉) Department of Mathematics, Missouri State University, Springfield, MO 65897, USA e-mail: [email protected]

1 Introduction

Ordinary least squares regression aims to estimate the conditional expectation of the response $Y$ given the predictor (vector) $\mathbf{x}$, i.e., $E(Y \mid \mathbf{x})$. However, the mean value (or the conditional expectation) is sensitive to outliers in the data [14]. Therefore, if the data is not homogeneously distributed, we expect least squares regression to give a poor prediction. The $\tau$th quantile of a distribution is defined as the value such that $100\tau\%$ of the mass lies on its left side. Compared to the mean value, quantiles are more robust to outliers [14]. Another advantage of quantiles is that a series of quantile values can describe the whole data distribution better than a single value (e.g., the mean) does. Let $Q_\tau(Y)$ be the $\tau$th quantile of a random variable $Y$; it can be proved [12] that

$$Q_\tau(Y) = \arg\min_{c} E_Y\left[\rho_\tau(Y - c)\right],$$

where $\rho_\tau(r)$ is the "check function" [14] defined by

$$\rho_\tau(r) = r\,I(r \ge 0) - (1 - \tau)\,r. \qquad (1)$$
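As an illustrative sketch (not part of the original article), the check function of Eq. 1 can be evaluated numerically, along with the property that a sample quantile minimizes the empirical check loss. The function names `check_loss` and `empirical_quantile_by_check_loss`, the use of NumPy, and the synthetic normal data are all assumptions made for this example.

```python
import numpy as np


def check_loss(r, tau):
    """Check function of Eq. 1: r * I(r >= 0) - (1 - tau) * r."""
    r = np.asarray(r, dtype=float)
    return r * (r >= 0) - (1.0 - tau) * r


def empirical_quantile_by_check_loss(y, tau):
    """Return the sample point c that minimizes the mean check loss of y - c.

    The empirical objective is piecewise linear and convex in c, so a
    minimizer can always be found among the observed values.
    """
    y = np.asarray(y, dtype=float)
    losses = [check_loss(y - c, tau).mean() for c in y]
    return y[int(np.argmin(losses))]


# Synthetic data for illustration only (hypothetical, fixed seed).
rng = np.random.default_rng(0)
y = rng.normal(size=501)

median_hat = empirical_quantile_by_check_loss(y, tau=0.5)
q25_hat = empirical_quantile_by_check_loss(y, tau=0.25)
```

With an odd sample size, the minimizer for the value 0.5 coincides with the sample median, and for the value 0.25 roughly a quarter of the observations fall at or below the returned point. The brute-force search over data points is chosen for clarity, not efficiency.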

The function $I(\cdot)$ in Eq. 1 is the indicator function, with $I(\cdot) = 1$ if the condition is true and $I(\cdot) = 0$ otherwise. Given data $\{(\mathbf{x}_i, Y_i),\ i = 1, \ldots, n\}$, with predictor vector $\mathbf{x}_i \in \mathbb{R}^p$ and response $Y_i \in \mathbb{R}$, let $q(\mathbf{x})$ be the $\tau$th conditional quantile of $Y$ given $\mathbf{x}$. Similar to least squares regression, quantile regression (QReg) [14] aims at estimating the conditional $\tau$th quantile of the response given predictor vector $\mathbf{x}$ and can be formulated a