New stepsizes for the gradient method
Cong Sun1 · Jin-Peng Liu2,3
Received: 15 January 2018 / Accepted: 22 November 2019 © Springer-Verlag GmbH Germany, part of Springer Nature 2019
Abstract
Gradient methods are valued for their simplicity and low per-iteration cost, which makes them increasingly attractive for large scale optimization problems. A good stepsize plays an important role in constructing an efficient gradient method. This paper proposes a new framework for generating stepsizes for gradient methods applied to convex quadratic minimization problems. By adopting different criteria, we propose four new gradient methods. For 2-dimensional unconstrained problems with convex quadratic objective functions, we prove that the new methods either terminate in finitely many iterations or converge R-superlinearly; for n-dimensional problems, we prove that all the new methods converge R-linearly. Numerical experiments show that the new methods enjoy lower complexity and outperform existing gradient methods.

Keywords Gradient method · Steepest descent · R-linear convergence rate · Finite termination
1 Introduction

The gradient method plays an important role in solving large scale optimization problems. It is widely used in many applications, e.g., machine learning and inverse problems [24]. It is therefore important to develop efficient stepsizes for gradient methods, together with their theoretical analysis.
This work is partially supported by NSFC Grants 11771056, 91630202 and 11871115, and the Young Elite Scientists Sponsorship Program by CAST 2017QNRC001.
Cong Sun (corresponding author): [email protected]
Jin-Peng Liu: [email protected]

1 School of Science, Beijing University of Posts and Telecommunications, Beijing 100876, China
2 School of Mathematics and Systems Science, Beihang University, Beijing, China
3 Present Address: Department of Mathematics, University of Maryland, College Park, MD 20742, USA
Consider the unconstrained convex quadratic optimization problem:

$$\min_{x \in \mathbb{R}^n} f(x) = b^T x + \frac{1}{2} x^T H x, \tag{1}$$

where $b \in \mathbb{R}^n$, and $H \in \mathbb{R}^{n \times n}$ is symmetric positive definite. We want to find the minimal value of $f(x)$ by means of gradient methods. Let $x_k$ be the iterate obtained at the $k$th iteration, and $g_k = \nabla f(x_k) = H x_k + b$ be the gradient vector at $x_k$. The gradient method iterates as:

$$x_{k+1} = x_k - \alpha_k g_k. \tag{2}$$

Different choices of the stepsize $\alpha_k$ lead to different performance. One particular choice is the exact linesearch, also called the Cauchy stepsize [2,3]:

$$\alpha_k^* = \arg\min_{\alpha > 0} f(x_k - \alpha g_k) = \frac{\|g_k\|_2^2}{g_k^T H g_k}.$$
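As a minimal illustration (a sketch, not the paper's new method), iteration (2) with the exact-linesearch Cauchy stepsize on a small quadratic can be coded as follows; the stopping tolerance and the 2-dimensional test problem are our own choices:

```python
import numpy as np

def steepest_descent(H, b, x0, tol=1e-8, max_iter=1000):
    """Gradient method for f(x) = b^T x + 0.5 x^T H x using the
    exact-linesearch (Cauchy) stepsize alpha_k = ||g_k||^2 / (g_k^T H g_k)."""
    x = x0.astype(float)
    for k in range(max_iter):
        g = H @ x + b                    # gradient of the quadratic
        if np.linalg.norm(g) < tol:      # stop when the gradient is small
            break
        alpha = (g @ g) / (g @ (H @ g))  # Cauchy stepsize
        x = x - alpha * g
    return x, k

# Illustrative 2-d problem: the minimizer solves H x* = -b.
H = np.array([[3.0, 0.0], [0.0, 1.0]])
b = np.array([-3.0, -1.0])
x_star, iters = steepest_descent(H, b, np.zeros(2))
```

On ill-conditioned problems the iterates of this method alternate between two directions, which is exactly the zigzag behavior discussed next.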
With exact linesearch steps, the gradient method becomes the steepest descent method. It converges linearly [23], but suffers from the zigzag phenomenon. Another choice of stepsize was proposed by Barzilai and Borwein [1]. The BB stepsizes are designed as approximations of the inverse of the Hessian matrix and possess a certain quasi-Newton property. This gives two BB stepsizes:

$$\alpha_k^{BB1} = \frac{s_k^T s_k}{s_k^T y_k}, \qquad \alpha_k^{BB2} = \frac{s_k^T y_k}{y_k^T y_k},$$

where $s_k = x_k - x_{k-1}$ and $y_k = g_k - g_{k-1}$.