Vector quantile regression and optimal transport, from theory to numerics
Guillaume Carlier1,2 · Victor Chernozhukov3 · Gwendoline De Bie4 · Alfred Galichon5

Received: 1 August 2019 / Accepted: 26 July 2020
© Springer-Verlag GmbH Germany, part of Springer Nature 2020, corrected publication 2020
Abstract In this paper, we first revisit the Koenker and Bassett variational approach to (univariate) quantile regression, emphasizing its link with latent factor representations and correlation maximization problems. We then review the multivariate extension due to Carlier et al. (Ann Statist 44(3):1165–92, 2016; J Multivariate Anal 161:96–102, 2017), which relates vector quantile regression to an optimal transport problem with mean independence constraints. We introduce an entropic regularization of this problem, implement a gradient descent numerical method, and illustrate its feasibility on univariate and bivariate examples.

Keywords Vector quantile regression · Optimal transport with mean independence constraints · Latent factors · Entropic regularization

JEL Classification C51 · C60
Alfred Galichon [email protected]
Guillaume Carlier [email protected]
Victor Chernozhukov [email protected]
Gwendoline De Bie [email protected]
1 CEREMADE, UMR CNRS 7534, PSL, Université Paris IX Dauphine, Pl. de Lattre de Tassigny, 75775 Paris Cedex 16, France
2 MOKAPLAN Inria, Paris, France
3 Department of Economics, MIT, 50 Memorial Drive, E52-361B, Cambridge, MA 02142, USA
4 DMA, ENS, Paris, France
5 Economics and Mathematics Departments, New York University, 70 Washington Square South, New York, NY 10013, USA
1 Introduction

Quantile regression, introduced by Koenker and Bassett Jr (1978), has become a very popular tool for analyzing the response of the whole distribution of a dependent variable to a set of predictors. It is a far-reaching generalization of median regression, allowing for a prediction of any quantile of the distribution.

We briefly recall classical quantile regression. For t ∈ [0, 1], it is well known that the t-quantile of Y given X = x, denoted q_t(x), minimizes the loss E[t ε⁺ + (1 − t) ε⁻ | X = x], where ε = Y − q_t(x), or equivalently E[ε⁺ + (t − 1) ε | X = x]. As a result, if q_t(x) is specified under the parametric form q_t(x) = β_t x + α_t, it is natural to estimate α_t and β_t by minimizing the loss

min_{α,β} E[(Y − β X − α)⁺ + (1 − t)(β X + α)].
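The pointwise estimation problem above can be sketched numerically. The following minimal example (not from the paper; the synthetic data and all variable names are our own illustrative choices) minimizes the empirical pinball loss E[t ε⁺ + (1 − t) ε⁻] for a single t with a generic optimizer:

```python
import numpy as np
from scipy.optimize import minimize

# Synthetic data: Y = 1 + 2 X + noise, so the conditional median is 1 + 2x.
rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=n)
Y = 1.0 + 2.0 * X + rng.normal(size=n)

def pinball_loss(params, t, X, Y):
    """Empirical version of E[t eps^+ + (1 - t) eps^-] with eps = Y - beta X - alpha."""
    alpha, beta = params
    eps = Y - beta * X - alpha
    return np.mean(t * np.maximum(eps, 0.0) + (1 - t) * np.maximum(-eps, 0.0))

t = 0.5  # median regression as a special case
res = minimize(pinball_loss, x0=[0.0, 0.0], args=(t, X, Y), method="Nelder-Mead")
alpha_t, beta_t = res.x
print(alpha_t, beta_t)  # should be near (1, 2) for this data
```

For t = 0.5 the loss reduces (up to a factor 1/2) to the absolute-deviation loss of median regression, which is why the fitted coefficients track the data-generating intercept and slope here.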
While the previous optimization problem estimates α_t and β_t for a pointwise value of t, if one would like to estimate the whole curve t → (α_t, β_t), one simply constructs the loss function by integrating the previous loss over t ∈ [0, 1]; the curve t → (α_t, β_t) thus solves

min_{(α_t, β_t)_{t∈[0,1]}} ∫₀¹ E[(Y − β_t X − α_t)⁺ + (1 − t)(β_t X + α_t)] dt.
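Since the integrand decouples across t, the curve can be approximated by solving the pointwise problem on a discrete grid of t values. A minimal sketch under that discretization (synthetic data; grid size and all names are our illustrative assumptions, not the paper's):

```python
import numpy as np
from scipy.optimize import minimize

# Synthetic data with homoskedastic Gaussian noise, so the true slope
# beta_t = 2 for every t and alpha_t = 1 + Phi^{-1}(t) is increasing in t.
rng = np.random.default_rng(1)
n = 400
X = rng.uniform(-1.0, 1.0, size=n)
Y = 1.0 + 2.0 * X + rng.normal(size=n)

def pinball(params, t):
    """Empirical E[eps^+ + (t - 1) eps], eps = Y - beta X - alpha."""
    alpha, beta = params
    eps = Y - beta * X - alpha
    return np.mean(np.maximum(eps, 0.0) + (t - 1.0) * eps)

# The integral over t decouples: solve one pointwise problem per grid point.
t_grid = np.linspace(0.1, 0.9, 9)
curve = np.array([
    minimize(pinball, x0=[0.0, 0.0], args=(t,), method="Nelder-Mead").x
    for t in t_grid
])
alphas, betas = curve[:, 0], curve[:, 1]
print(alphas)  # increasing in t for this data-generating process
```

Note that nothing in the pointwise problems enforces monotonicity of t → α_t + β_t x across the grid; in finite samples the fitted quantile curves can cross, which is one motivation for the variational reformulations discussed in the paper.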
As has been known since the original work of Koenker and Bassett, this problem has an (infinite-dimensional) linear programming formulation. Defining P_t = (Y − β_t X − α_t)⁺ as the positive deviations of Y with respect to their predicted quantile
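The finite-sample linear program for a fixed t can be written explicitly by splitting each residual into a positive part P_i and a negative part N_i, with α + β X_i + P_i − N_i = Y_i and P_i, N_i ≥ 0. A hedged sketch with scipy.optimize.linprog (synthetic data; the variable layout is our own encoding, not taken from the paper):

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(2)
n = 200
X = rng.normal(size=n)
Y = 1.0 + 2.0 * X + rng.normal(size=n)

t = 0.5
# Decision variables, in order: [alpha, beta, P_1..P_n, N_1..N_n].
# Objective: t * sum(P) + (1 - t) * sum(N), i.e. the empirical pinball loss.
c = np.concatenate([[0.0, 0.0], t * np.ones(n), (1 - t) * np.ones(n)])

# Equality constraints: alpha + beta * X_i + P_i - N_i = Y_i for each i.
A_eq = np.zeros((n, 2 + 2 * n))
A_eq[:, 0] = 1.0                  # coefficient of alpha
A_eq[:, 1] = X                    # coefficient of beta
A_eq[:, 2:2 + n] = np.eye(n)      # +P_i
A_eq[:, 2 + n:] = -np.eye(n)      # -N_i
b_eq = Y

# alpha, beta free; deviations P, N nonnegative.
bounds = [(None, None), (None, None)] + [(0, None)] * (2 * n)

res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=bounds, method="highs")
alpha_t, beta_t = res.x[0], res.x[1]
print(res.status, alpha_t, beta_t)  # status 0 = optimal
```

At the optimum, P and N play exactly the role of the positive and negative deviations in the text: complementarity forces P_i · N_i = 0, so each residual is carried by at most one of the two variables.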