Newton-type methods for non-convex optimization under inexact Hessian information

Mathematical Programming, Series A

Peng Xu¹ · Fred Roosta²,³ · Michael W. Mahoney³,⁴

Received: 27 November 2017 / Accepted: 16 May 2019
© Springer-Verlag GmbH Germany, part of Springer Nature and Mathematical Optimization Society 2019

Abstract
We consider variants of trust-region and adaptive cubic regularization methods for non-convex optimization, in which the Hessian matrix is approximated. Under certain conditions on the inexact Hessian, and using approximate solutions of the corresponding sub-problems, we establish iteration complexities for achieving ε-approximate second-order optimality that have been shown to be tight. Our Hessian approximation condition offers a range of advantages over prior works and allows for direct construction of the approximate Hessian with a priori guarantees through various techniques, including randomized sampling methods. In this light, we consider the canonical problem of finite-sum minimization, provide appropriate uniform and non-uniform sub-sampling strategies to construct such Hessian approximations, and obtain optimal iteration complexity for the corresponding sub-sampled trust-region and adaptive cubic regularization methods.

Keywords Non-convex optimization · Inexact Hessian · Trust region · Cubic regularization · Randomized numerical linear algebra

Mathematics Subject Classification 49M15 · 65K05 · 90C25 · 90C06
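For reference, a standard notion of ε-approximate second-order optimality in this literature (the exact tolerances and variant adopted in the paper may differ) requires a point x to satisfy

\[
\|\nabla F(\mathbf{x})\| \le \varepsilon_g,
\qquad
\lambda_{\min}\!\left(\nabla^{2} F(\mathbf{x})\right) \ge -\varepsilon_H,
\]

for given tolerances $\varepsilon_g, \varepsilon_H \in (0,1)$, i.e., a small gradient and nearly positive semi-definite curvature.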

Corresponding author: Fred Roosta ([email protected])
Peng Xu ([email protected]) · Michael W. Mahoney ([email protected])

1 Institute for Computational and Mathematical Engineering, Stanford University, Stanford, USA
2 School of Mathematics and Physics, University of Queensland, Brisbane, Australia
3 International Computer Science Institute, Berkeley, USA
4 Department of Statistics, University of California at Berkeley, Berkeley, USA


1 Introduction

Consider the generic unconstrained optimization problem

$$
\min_{\mathbf{x} \in \mathbb{R}^{d}} F(\mathbf{x}),
\tag{P0}
$$

where $F: \mathbb{R}^{d} \rightarrow \mathbb{R}$ is smooth and non-convex. Faced with the large-scale nature of modern “big-data” problems, many classical optimization algorithms can prove inefficient, if applicable at all. In this light, much recent research has centered on designing variants of classical algorithms which, by employing suitable approximations of the gradient and/or Hessian, improve the cost per iteration while maintaining the original iteration complexity. To this end, we focus on trust-region (TR) [17] and cubic regularization (CR) [34], two algorithms that are considered to be among the most elegant and theoretically sound general-purpose Newton-type methods for non-convex problems. In doing so, we first consider (P0) and study the theoretical convergence properties of variants of these two algorithms in which, under favorable conditions, the Hessian is suitably approximated. We show that our Hessian approximation conditions are, in many cases, weaker than the existing ones in the literature. In addition, and in contrast to some
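To make the setting concrete, the following minimal sketch shows one cubic-regularization step with a uniformly sub-sampled Hessian on a finite-sum least-squares objective. Everything here is an illustrative assumption rather than the paper's algorithm: the choice of objective, the sample size, the fixed regularization parameter sigma, and the Cauchy-point subproblem solver.

```python
# Minimal sketch (not the paper's algorithm): one cubic-regularization step
# with a uniformly sub-sampled Hessian, on a finite-sum least-squares
# objective F(x) = (1/(2n)) * ||Ax - b||^2, whose Hessian is H = A^T A / n.
import numpy as np


def subsampled_hessian(A, sample_size, rng):
    """Average sample_size uniformly sampled rank-one terms a_i a_i^T;
    this approximates the full Hessian A^T A / n from a subset of data."""
    idx = rng.choice(A.shape[0], size=sample_size, replace=False)
    A_S = A[idx]
    return A_S.T @ A_S / sample_size


def cubic_cauchy_step(g, H, sigma):
    """Minimize the cubic model m(s) = g^T s + 0.5 s^T H s + (sigma/3)||s||^3
    along the steepest-descent direction -g (the generalized Cauchy point),
    a cheap approximate solution of the cubic sub-problem."""
    gn = np.linalg.norm(g)
    if gn == 0.0:
        return np.zeros_like(g)
    gHg = g @ H @ g
    # d/dt m(-t*g) = -||g||^2 + t*(g^T H g) + sigma*t^2*||g||^3 = 0;
    # take the positive root of this quadratic in t.
    qa, qb, qc = sigma * gn**3, gHg, -gn**2
    t = (-qb + np.sqrt(qb * qb - 4.0 * qa * qc)) / (2.0 * qa)
    return -t * g


rng = np.random.default_rng(0)
n, d = 1000, 20
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

x = np.zeros(d)
g = A.T @ (A @ x - b) / n               # exact gradient of F at x
H = subsampled_hessian(A, 100, rng)     # inexact (sub-sampled) Hessian
s = cubic_cauchy_step(g, H, sigma=1.0)  # approximate sub-problem solution
x_new = x + s
```

The point of the sketch is the cost structure: the approximate Hessian is formed from only |S| ≪ n of the n rank-one terms, so the per-iteration cost drops while the cubic model still carries curvature information.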