Properties of the delayed weighted gradient method
Roberto Andreani1 · Marcos Raydan2

Received: 6 June 2020 / Accepted: 19 September 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020
Abstract
The delayed weighted gradient method, recently introduced in Oviedo-Leon (Comput Optim Appl 74:729–746, 2019), is a low-cost gradient-type method that exhibits a surprisingly, and perhaps unexpectedly, fast convergence behavior, competing favorably with the well-known conjugate gradient method for the minimization of convex quadratic functions. In this work, we establish several orthogonality properties that add understanding to the practical behavior of the method, including its finite termination. We show that if the n × n real Hessian matrix of the quadratic function has only p < n distinct eigenvalues, then the method terminates in p iterations. We also establish an optimality condition, concerning the gradient norm, that motivates the future use of this novel scheme when low precision is required for the minimization of non-quadratic functions.

Keywords Gradient methods · Conjugate gradient methods · Smoothing techniques · Finite termination · Krylov subspace methods
1 Introduction

Recently [13], Oviedo proposed a low-cost method for the minimization of large-scale convex quadratic functions, based on a smoothing technique combined with a one-step delayed gradient method. For gradient-type methods, smoothing techniques were developed previously [1, 11], as were delayed schemes [7, 12]. A skillful combination of these independent ideas produces the so-called delayed weighted gradient method (DWGM), which exhibits an impressively fast

* Marcos Raydan, [email protected]; Roberto Andreani, [email protected]

1 Department of Applied Mathematics, IMECC-UNICAMP, University of Campinas, Rua Sérgio Buarque de Holanda, 651 Cidade Universitária "Zeferino Vaz", Distrito Barão Geraldo, Campinas, SP 13083-859, Brazil
2 Centro de Matemática e Aplicações (CMA), FCT, UNL, 2829-516 Caparica, Portugal
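The DWGM update rule itself is not reproduced in this excerpt. The following is a minimal NumPy sketch of the scheme as it is commonly stated following [13]: a minimal-gradient (CR-type) step from the current iterate, followed by a combination with the one-step-delayed iterate, with the weight β chosen to minimize the norm of the new gradient. The function name `dwgm` and the exact arrangement of the steps are our paraphrase, not code from the paper.

```python
import numpy as np

def dwgm(A, b, x0, tol=1e-10, max_iter=50):
    """Sketch of the delayed weighted gradient method for
    min f(x) = 0.5 x'Ax - b'x, with A symmetric positive definite.
    Returns the final iterate and the history of gradient norms."""
    x_prev = x0.copy()          # one-step-delayed iterate x_{k-1}
    x = x0.copy()
    g_prev = A @ x0 - b         # delayed gradient g_{k-1}
    g = g_prev.copy()
    history = [np.linalg.norm(g)]
    for _ in range(max_iter):
        if history[-1] <= tol:
            break
        w = A @ g
        alpha = (g @ w) / (w @ w)       # minimal-gradient (CR-type) step size
        y = x - alpha * g               # intermediate point
        r = g - alpha * w               # gradient at y
        d = g_prev - r
        beta = (g_prev @ d) / (d @ d)   # minimizes ||g_prev + beta*(r - g_prev)||
        x_new = x_prev + beta * (y - x_prev)
        g_new = g_prev + beta * (r - g_prev)
        x_prev, g_prev = x, g
        x, g = x_new, g_new
        history.append(np.linalg.norm(g))
    return x, history

# Demo in the setting of Table 1: eigenvalues 1,...,5 (A diagonal for simplicity).
A = np.diag(np.arange(1.0, 6.0))
b = np.ones(5)
x, hist = dwgm(A, b, np.zeros(5))
```

Note that since α minimizes ||g_k − αAg_k|| and β = 1 recovers r_k, the gradient norm is nonincreasing along the iterations, consistent with the monotone behavior reported for DWGM.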
Table 1 Gradient norm at each iteration (Iter i) of DWGM, CG, and CR, from the same initial point, for the minimization of a strictly convex quadratic function when n = 5 and the eigenvalues of A are given by λi = i, for 1 ≤ i ≤ 5

Method   Iter 1   Iter 2   Iter 3   Iter 4   Iter 5
CG       1.3347   0.6458   0.2418   0.0368   0.0
CR       1.2934   0.5778   0.2231   0.0363   0.0
DWGM     0.6679   0.2259   0.0442   0.0064   0.0
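In Table 1 all three methods terminate in n = 5 iterations because A has 5 distinct eigenvalues. The abstract's sharper claim, termination in p < n iterations when only p eigenvalues are distinct, mirrors the classical Krylov-subspace property of CG, which is easy to check numerically. The setup below (a 6 × 6 SPD matrix with the three distinct eigenvalues 1, 2, 3, each of multiplicity two, hidden by a random orthogonal similarity) is our own illustrative choice, not an experiment from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# SPD matrix with only p = 3 distinct eigenvalues {1, 2, 3}, each of
# multiplicity two, disguised by a random orthogonal similarity Q.
Q, _ = np.linalg.qr(rng.standard_normal((6, 6)))
A = Q @ np.diag([1.0, 1.0, 2.0, 2.0, 3.0, 3.0]) @ Q.T
b = rng.standard_normal(6)

def cg(A, b, x0, max_iter):
    """Textbook conjugate gradient for A x = b, A symmetric positive definite.
    Returns the final iterate and the history of residual norms."""
    x, r = x0.copy(), b - A @ x0
    p = r.copy()
    norms = [np.linalg.norm(r)]
    for _ in range(max_iter):
        if norms[-1] < 1e-12 * norms[0]:
            break                          # residual at roundoff level
        Ap = A @ p
        alpha = (r @ r) / (p @ Ap)
        x = x + alpha * p
        r_new = r - alpha * Ap
        beta = (r_new @ r_new) / (r @ r)
        p, r = r_new + beta * p, r_new
        norms.append(np.linalg.norm(r))
    return x, norms

x, norms = cg(A, b, np.zeros(6), max_iter=6)
# The relative residual collapses to roundoff level after p = 3 iterations,
# even though n = 6.
```

The contribution of the paper is that DWGM, despite not being a textbook Krylov method, enjoys the same p-iteration finite-termination property.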
[Figure 1 omitted: two semilogarithmic plots of gradient norm versus iterations for CG, CR, and DWGM.]

Fig. 1 Convergence history of DWGM, CG, and CR, from the same initial point, for the minimization of a strictly convex quadratic function. On the left, n = 11 and we use the following distribution of eigenvalues of A: λ1 = 0.1, λi = i for 2 ≤ i ≤ 10, and λ11 = 1000. On the right, n