First-order optimization algorithms via inertial systems with Hessian driven damping

  • PDF / 804,575 Bytes
  • 43 Pages / 439.37 x 666.142 pts Page_size
  • 97 Downloads / 189 Views

DOWNLOAD

REPORT


Series A

First-order optimization algorithms via inertial systems with Hessian driven damping Hedy Attouch1 · Zaki Chbani2 · Jalal Fadili3 · Hassan Riahi2 Received: 24 July 2019 / Accepted: 3 November 2020 © Springer-Verlag GmbH Germany, part of Springer Nature and Mathematical Optimization Society 2020

Abstract In a Hilbert space setting, for convex optimization, we analyze the convergence rate of a class of first-order algorithms involving inertial features. They can be interpreted as discrete time versions of inertial dynamics involving both viscous and Hessian-driven dampings. The geometrical damping driven by the Hessian intervenes in the dynamics ˙ By treating this term as the time derivative of ∇ f (x(t)), in the form ∇ 2 f (x(t))x(t). this gives, in discretized form, first-order algorithms in time and space. In addition to the convergence properties attached to Nesterov-type accelerated gradient methods, the algorithms thus obtained are new and show a rapid convergence towards zero of the gradients. On the basis of a regularization technique using the Moreau envelope, we extend these methods to non-smooth convex functions with extended real values. The introduction of time scale factors makes it possible to further accelerate these algorithms. We also report numerical results on structured problems to support our theoretical findings. Keywords Hessian driven damping · Inertial optimization algorithms · Nesterov accelerated gradient method · Ravine method · Time rescaling

B

Jalal Fadili [email protected] Hedy Attouch [email protected] Zaki Chbani [email protected] Hassan Riahi [email protected]

1

IMAG, Univ. Montpellier, CNRS, Montpellier, France

2

Faculty of Sciences Semlalia, Mathematics, Cadi Ayyad University, 40000 Marrakech, Morocco

3

Normandie Université-ENSICAEN, CNRS, GREYC, Caen, France

123

H. Attouch et al.

Mathematics Subject Classification 37N40 · 46N10 · 49M30 · 65B99 · 65K05 · 65K10 · 90B50 · 90C25

1 Introduction Unless specified, throughout the paper we make the following assumptions ⎧ ⎪ ⎨H is a real Hilbert space; f : H → R is a convex function of class C 2 , S := argminH f = ∅; ⎪ ⎩ γ , β, b : [t0 , +∞[→ R+ are non-negative continuous functions, t0 > 0.

(H)

As a guide in our study, we will rely on the asymptotic behavior, when t → +∞, of the trajectories of the inertial system with Hessian-driven damping ˙ + b(t)∇ f (x(t)) = 0, x(t) ¨ + γ (t)x(t) ˙ + β(t)∇ 2 f (x(t))x(t) γ (t) and β(t) are damping parameters, and b(t) is a time scale parameter. The time discretization of this system will provide a rich family of first-order methods for minimizing f . At first glance, the presence of the Hessian may seem to entail numerical difficulties. However, this is not the case as the Hessian intervenes in ˙ which is nothing but the derivative w.r.t. the above ODE in the form ∇ 2 f (x(t))x(t), time of ∇ f (x(t)). This explains why the time discretization of this dynamic provides first-order algorithms. Thus, the Nesterov extrapolation scheme [25,26] is modified b