Lower bounds for finding stationary points I
- PDF / 962,175 Bytes
- 50 Pages / 439.37 x 666.142 pts Page_size
- 16 Downloads / 222 Views
Series A
Lower bounds for finding stationary points I Yair Carmon1
· John C. Duchi2 · Oliver Hinder3 · Aaron Sidford3
Received: 13 December 2017 / Accepted: 27 May 2019 © Springer-Verlag GmbH Germany, part of Springer Nature and Mathematical Optimization Society 2019
Abstract We prove lower bounds on the complexity of finding -stationary points (points x such that ∇ f (x) ≤ ) of smooth, high-dimensional, and potentially non-convex functions f . We consider oracle-based complexity measures, where an algorithm is given access to the value and all derivatives of f at a query point x. We show that for any (potentially randomized) algorithm A, there exists a function f with Lipschitz pth order derivatives such that A requires at least −( p+1)/ p queries to find an -stationary point. Our lower bounds are sharp to within constants, and they show that gradient descent, cubic-regularized Newton’s method, and generalized pth order regularization are worst-case optimal within their natural function classes. Keywords Non-convex optimization · Information-based complexity · Dimension-free rates · Gradient descent · Cubic regularization of Newton’s method Mathematics Subject Classification 90C06 · 90C26 · 90C30 · 90C60 · 68Q25
OH was supported by the PACCAR INC fellowship. YC and JCD were partially supported by the SAIL-Toyota Center for AI Research, NSF-CAREER award 1553086, and a Sloan Foundation Fellowship in Mathematics. YC was partially supported by the Stanford Graduate Fellowship and the Numerical Technologies Fellowship.
B
Yair Carmon [email protected] John C. Duchi [email protected] Oliver Hinder [email protected] Aaron Sidford [email protected]
1
Department of Electrical Engineering, Stanford University, Stanford, CA 94305, USA
2
Departments of Statistics and Electrical Engineering, Stanford University, Stanford, CA 94305, USA
3
Department of Management Science and Engineering, Stanford University, Stanford, CA 94305, USA
123
Y. Carmon et al.
1 Introduction Consider the optimization problem minimize f (x) x∈Rd
where f : Rd → R is smooth, but possibly non-convex. In general, it is intractable to even approximately minimize such f [33,35], so—following an established line of research—we consider the problem of finding an -stationary point of f , meaning some x ∈ Rd such that ∇ f (x) ≤ .
(1)
We prove lower bounds on the number of function and derivative evaluations required for algorithms to find a point x satisfying inequality (1). While for arbitrary smooth f , a near-stationary point (1) is certainly insufficient for any type of optimality, there are a number of reasons to study algorithms and complexity for finding stationary points. In several statistical and engineering problems, including regression models with non-convex penalties and objectives [30,31], phase retrieval [12,42], and nonconvex (low-rank) reformulations of semidefinite programs and matrix completion [8,11,27], it is possible to show that all first- or second-order stationary points are (near) global minima. Th
Data Loading...