A New Analysis on the Barzilai-Borwein Gradient Method
Yu-Hong Dai
Received: 30 October 2012 / Revised: 22 February 2013 / Accepted: 26 February 2013 / Published online: 14 March 2013 © Operations Research Society of China, Periodicals Agency of Shanghai University, and Springer-Verlag Berlin Heidelberg 2013
Abstract Due to its simplicity and efficiency, the Barzilai and Borwein (BB) gradient method has received much attention in various fields. This paper presents a new analysis of the BB method for two-dimensional strictly convex quadratic functions. The analysis begins with the assumption that the gradient norms at the first two iterations are fixed. We show that there is a superlinear convergence step in at most three consecutive steps. Meanwhile, we provide a better convergence relation for the BB method. The influence of the starting point and the condition number on the convergence rate is comprehensively addressed.

Keywords Unconstrained optimization · Barzilai and Borwein gradient method · Quadratic function · R-superlinear convergence · Condition number
1 Introduction

Consider the problem of minimizing a strictly convex quadratic,

\min f(x) = \frac{1}{2} x^T A x - b^T x, \qquad (1.1)
where A ∈ R n×n is a real symmetric positive definite matrix and b ∈ R n . The Barzilai and Borwein (BB) method for solving (1.1) takes the negative gradient as its search
This work was partly supported by the Chinese NSF grants (Nos. 10831106 and 81173633), the CAS grant (No. kjcx-yw-s7-03) and the China National Funds for Distinguished Young Scientists (No. 11125107). Y.-H. Dai () State Key Laboratory of Scientific and Engineering Computing, Institute of Computational Mathematics and Scientific/Engineering Computing, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, P.O. Box 2719, Beijing 100190, P.R. China e-mail: [email protected]
direction and updates the solution approximation iteratively by

x_{k+1} = x_k - \alpha_k g_k, \qquad (1.2)
where g_k = \nabla f(x_k) and \alpha_k is determined by the information gathered at the points x_{k-1} and x_k. Specifically, denote s_{k-1} = x_k - x_{k-1} and y_{k-1} = g_k - g_{k-1}. Since the matrix D_k = \alpha_k^{-1} I, where I is the identity matrix, can be regarded as an approximation to the Hessian of f at x_k, Barzilai and Borwein [2] chose the stepsize \alpha_k so that D_k satisfies a certain quasi-Newton property:

D_k = \arg\min_{D = \alpha^{-1} I} \| D s_{k-1} - y_{k-1} \|, \qquad (1.3)
where, here and below, \| \cdot \| denotes the Euclidean (two-) norm, yielding

\alpha_k = \frac{s_{k-1}^T s_{k-1}}{s_{k-1}^T y_{k-1}}. \qquad (1.4)
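Update rule (1.2) together with stepsize (1.4) is all that is needed to implement the method. The following is a minimal sketch in Python/NumPy, not the authors' code; the function name is illustrative, and since (1.4) requires two previous iterates, the sketch assumes an exact (Cauchy) stepsize for the first iteration.

```python
import numpy as np

def bb_gradient(A, b, x0, tol=1e-10, max_iter=500):
    """Barzilai-Borwein sketch for min (1/2) x^T A x - b^T x, A SPD."""
    x = x0.astype(float)
    g = A @ x - b                      # gradient of the quadratic
    alpha = (g @ g) / (g @ (A @ g))    # exact stepsize to start (assumption)
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        x_new = x - alpha * g          # update (1.2)
        g_new = A @ x_new - b
        s, y = x_new - x, g_new - g
        alpha = (s @ s) / (s @ y)      # BB stepsize (1.4); s^T y > 0 since A is SPD
        x, g = x_new, g_new
    return x
```

Note that the iteration needs only one matrix-vector product per step and no line search, which is the source of the method's low per-iteration cost.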
Compared with the classical steepest descent (SD) method of Cauchy [4], which takes the stepsize as the exact one-dimensional minimizer along x_k - \alpha g_k,

\alpha_k^{SD} = \arg\min_{\alpha > 0} f(x_k - \alpha g_k), \qquad (1.5)
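For the quadratic (1.1), the one-dimensional minimizer in (1.5) has the closed form \alpha_k^{SD} = g_k^T g_k / (g_k^T A g_k), obtained by setting the derivative of \phi(\alpha) = f(x_k - \alpha g_k) to zero. A brief numerical check (the matrix and point below are illustrative values, not from the paper):

```python
import numpy as np

# Illustrative SPD quadratic and trial point (assumed for the check).
A = np.array([[2.0, 0.0], [0.0, 8.0]])
b = np.array([1.0, 1.0])
x = np.array([3.0, -2.0])
g = A @ x - b                          # gradient at x

# Closed-form exact stepsize for the quadratic case of (1.5).
alpha_sd = (g @ g) / (g @ (A @ g))

# phi'(alpha) = -g^T grad f(x - alpha*g); it should vanish at alpha_sd.
phi_prime = -(g @ (A @ (x - alpha_sd * g) - b))
```

The closed form makes the per-step cost of SD comparable to BB on quadratics; the difference the paper studies is in how fast the two stepsize choices drive the gradient to zero.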
the BB method often requires less computational work and converges much faster. Consequently, owing to its simplicity and efficiency, the BB method has been extended and applied in many settings. To mention just a few of them, Raydan [11] proposed an efficient global Barzilai and Borwein algorithm for unconstrained optimization by combining the