Accelerated Methods for Saddle-Point Problem
M. S. Alkousa^{a,b,*}, A. V. Gasnikov^{a,b,c,**}, D. M. Dvinskikh^{c,d,***}, D. A. Kovalev^{e,****}, and F. S. Stonyakin^{f,a,*****}

^a Moscow Institute of Physics and Technology, Dolgoprudny, Moscow oblast, 141700 Russia
^b National Research University Higher School of Economics, Moscow, 101000 Russia
^c Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, 127051 Russia
^d Weierstrass Institute for Applied Analysis and Stochastics, Berlin, 10117 Germany
^e King Abdullah University of Science and Technology, Thuwal, 23955 Saudi Arabia
^f Crimean Federal University, Simferopol, 295007 Russia

*e-mail: [email protected]
**e-mail: [email protected]
***e-mail: [email protected]
****e-mail: [email protected]
*****e-mail: [email protected]

Received December 1, 2019; revised December 20, 2019; accepted July 7, 2020
Abstract—Recently, it has been shown how accelerated methods for more complex (structured) problems, and for problems solved using various kinds of local information about the behavior of a function (stochastic gradient, Hessian, etc.), can be obtained on the basis of the usual accelerated gradient method for smooth convex optimization. The term "accelerated methods" here means, on the one hand, the existence of a unified and fairly general acceleration scheme and, on the other hand, the optimality of the resulting methods, which can often be proved rigorously. In the present work, an attempt is made to construct in the same way a theory of accelerated methods for solving smooth convex-concave saddle-point problems with a structure. The main result of this article is the establishment of conditions that are, in a certain sense, necessary and sufficient for the complexity of solving nonlinear convex-concave saddle-point problems with a structure, measured in the number of computations of the gradients of the composites in the direct variables, to be equal in order of magnitude to the complexity of solving bilinear problems with a structure.

Keywords: saddle-point problem, accelerated method, sliding, prox-friendly function

DOI: 10.1134/S0965542520110020
1. INTRODUCTION

One of the main directions of research in numerical methods of convex optimization over the last decade has been the extension of the acceleration technique behind the accelerated gradient method, proposed in 1983 by Yu. Nesterov [1], to other numerical optimization methods. Let us briefly recall the original technique. To solve the convex optimization problem
f(x) \to \min_{x \in \mathbb{R}^n},    (1)
it was suggested to replace the standard gradient descent method
x^{k+1} = x^k - \frac{1}{L} \nabla f(x^k), \quad k \ge 0,

where L is an upper bound for the Lipschitz constant of the gradient of f(x) in the 2-norm (hereinafter, for brevity, we say that f(x) has an L-Lipschitz gradient), with the following accelerated method:
x^1 = x^0 - \frac{1}{L} \nabla f(x^0),

x^{k+1} = x^k - \frac{1}{L} \nabla f\left( x^k + \frac{k-1}{k+2}\,(x^k - x^{k-1}) \right) + \frac{k-1}{k+2}\,(x^k - x^{k-1}), \quad k \ge 1.
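As an illustration (not part of the original paper), the two iterations above can be compared numerically on a smooth convex quadratic, for which the gradient and the Lipschitz constant L are available in closed form. The test matrix, the dimension n, and the step count below are arbitrary choices for the sketch.

```python
import numpy as np

# Illustrative comparison (assumed test setup, not from the paper):
# plain gradient descent versus Nesterov's 1983 accelerated method on
# f(x) = 0.5 x^T A x - b^T x, so that grad f(x) = A x - b and the
# gradient's Lipschitz constant in the 2-norm is L = lambda_max(A).

rng = np.random.default_rng(0)
n = 50
M = rng.standard_normal((n, n))
A = M @ M.T / n + 1e-2 * np.eye(n)   # symmetric positive definite, ill-conditioned
b = rng.standard_normal(n)

grad = lambda x: A @ x - b
f = lambda x: 0.5 * x @ A @ x - b @ x
L = np.linalg.eigvalsh(A).max()      # Lipschitz constant of grad f
f_star = f(np.linalg.solve(A, b))    # optimal value, for reference

def gradient_descent(steps):
    # x^{k+1} = x^k - (1/L) grad f(x^k)
    x = np.zeros(n)
    for _ in range(steps):
        x = x - grad(x) / L
    return x

def nesterov_1983(steps):
    # The accelerated iteration from the text, rewritten with
    # y^k = x^k + ((k-1)/(k+2)) (x^k - x^{k-1}):
    #   x^{k+1} = y^k - (1/L) grad f(y^k),
    # which is algebraically identical to the displayed formula.
    x_prev = np.zeros(n)
    x = x_prev - grad(x_prev) / L    # first step is a plain gradient step
    for k in range(1, steps):
        beta = (k - 1) / (k + 2)
        y = x + beta * (x - x_prev)
        x_prev, x = x, y - grad(y) / L
    return x

steps = 300
gap_gd = f(gradient_descent(steps)) - f_star
gap_acc = f(nesterov_1983(steps)) - f_star
print(f"gradient descent gap: {gap_gd:.3e}")
print(f"accelerated gap:      {gap_acc:.3e}")
```

On an ill-conditioned quadratic like this one, the momentum term (k-1)/(k+2) lets the accelerated method reach a far smaller objective gap in the same number of gradient calls, in line with its O(1/k^2) rate versus O(1/k) for plain gradient descent.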