A New Analysis on the Barzilai-Borwein Gradient Method
Yu-Hong Dai
Received: 30 October 2012 / Revised: 22 February 2013 / Accepted: 26 February 2013 / Published online: 14 March 2013 © Operations Research Society of China, Periodicals Agency of Shanghai University, and Springer-Verlag Berlin Heidelberg 2013
Abstract Due to its simplicity and efficiency, the Barzilai and Borwein (BB) gradient method has received much attention in various fields. This paper presents a new analysis of the BB method for two-dimensional strictly convex quadratic functions. The analysis begins with the assumption that the gradient norms at the first two iterations are fixed. We show that there is a superlinear convergence step in at most three consecutive steps. Meanwhile, we provide a better convergence relation for the BB method. The influence of the starting point and the condition number on the convergence rate is comprehensively addressed.

Keywords Unconstrained optimization · Barzilai and Borwein gradient method · Quadratic function · R-superlinear convergence · Condition number
1 Introduction

Consider the problem of minimizing a strictly convex quadratic,

\min f(x) = \frac{1}{2} x^T A x - b^T x, \qquad (1.1)
where A ∈ R n×n is a real symmetric positive definite matrix and b ∈ R n . The Barzilai and Borwein (BB) method for solving (1.1) takes the negative gradient as its search
This work was partly supported by the Chinese NSF grants (Nos. 10831106 and 81173633), the CAS grant (No. kjcx-yw-s7-03) and the China National Funds for Distinguished Young Scientists (No. 11125107). Y.-H. Dai () State Key Laboratory of Scientific and Engineering Computing, Institute of Computational Mathematics and Scientific/Engineering Computing, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, P.O. Box 2719, Beijing 100190, P.R. China e-mail: [email protected]
direction and updates the solution approximation iteratively by

x_{k+1} = x_k - \alpha_k g_k, \qquad (1.2)
where g_k = \nabla f(x_k) and \alpha_k is determined by the information gathered at the points x_{k-1} and x_k. Specifically, denote s_{k-1} = x_k - x_{k-1} and y_{k-1} = g_k - g_{k-1}. Since the matrix D_k = \alpha_k^{-1} I, where I is the identity matrix, can be regarded as an approximation to the Hessian of f at x_k, Barzilai and Borwein [2] chose the stepsize \alpha_k so that D_k satisfies a certain quasi-Newton property:

D_k = \arg\min_{D = \alpha^{-1} I} \| D s_{k-1} - y_{k-1} \|, \qquad (1.3)
where, here and below, \| \cdot \| denotes the Euclidean (two-) norm, yielding

\alpha_k = \frac{s_{k-1}^T s_{k-1}}{s_{k-1}^T y_{k-1}}. \qquad (1.4)
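Update rule (1.2) together with stepsize (1.4) is all that is needed to implement the method. The following is a minimal sketch in Python/NumPy, not the authors' code; the function name is illustrative, and since (1.4) requires two previous iterates, the sketch assumes an exact (Cauchy) stepsize for the first iteration.

```python
import numpy as np

def bb_gradient(A, b, x0, tol=1e-10, max_iter=500):
    """Barzilai-Borwein sketch for min (1/2) x^T A x - b^T x, A SPD."""
    x = x0.astype(float)
    g = A @ x - b                      # gradient of the quadratic
    alpha = (g @ g) / (g @ (A @ g))    # exact stepsize to start (assumption)
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        x_new = x - alpha * g          # update (1.2)
        g_new = A @ x_new - b
        s, y = x_new - x, g_new - g
        alpha = (s @ s) / (s @ y)      # BB stepsize (1.4); s^T y > 0 since A is SPD
        x, g = x_new, g_new
    return x
```

Note that the iteration needs only one matrix-vector product per step and no line search, which is the source of the method's low per-iteration cost.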
Compared with the classical steepest descent (SD) method of Cauchy [4], which takes the stepsize as the exact one-dimensional minimizer along x_k - \alpha g_k,

\alpha_k^{SD} = \arg\min_{\alpha > 0} f(x_k - \alpha g_k), \qquad (1.5)
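For the quadratic (1.1), the one-dimensional minimizer in (1.5) has the closed form \alpha_k^{SD} = g_k^T g_k / (g_k^T A g_k), obtained by setting the derivative of \phi(\alpha) = f(x_k - \alpha g_k) to zero. A brief numerical check (the matrix and point below are illustrative values, not from the paper):

```python
import numpy as np

# Illustrative SPD quadratic and trial point (assumed for the check).
A = np.array([[2.0, 0.0], [0.0, 8.0]])
b = np.array([1.0, 1.0])
x = np.array([3.0, -2.0])
g = A @ x - b                          # gradient at x

# Closed-form exact stepsize for the quadratic case of (1.5).
alpha_sd = (g @ g) / (g @ (A @ g))

# phi'(alpha) = -g^T grad f(x - alpha*g); it should vanish at alpha_sd.
phi_prime = -(g @ (A @ (x - alpha_sd * g) - b))
```

The closed form makes the per-step cost of SD comparable to BB on quadratics; the difference the paper studies is in how fast the two stepsize choices drive the gradient to zero.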
the BB method often requires less computational work and converges much faster. Consequently, owing to its simplicity and efficiency, the BB method has been extended and applied in many settings. To mention just a few of them, Raydan [11] proposed an efficient global Barzilai and Borwein algorithm for unconstrained optimization by combining the