Lower bounds for finding stationary points I
- PDF / 962,175 Bytes
- 50 Pages / 439.37 x 666.142 pts Page_size
- 18 Downloads / 269 Views
		    Series A
 
 Lower bounds for finding stationary points I Yair Carmon1
 
 · John C. Duchi2 · Oliver Hinder3 · Aaron Sidford3
 
 Received: 13 December 2017 / Accepted: 27 May 2019 © Springer-Verlag GmbH Germany, part of Springer Nature and Mathematical Optimization Society 2019
 
 Abstract We prove lower bounds on the complexity of finding -stationary points (points x such that ∇ f (x) ≤ ) of smooth, high-dimensional, and potentially non-convex functions f . We consider oracle-based complexity measures, where an algorithm is given access to the value and all derivatives of f at a query point x. We show that for any (potentially randomized) algorithm A, there exists a function f with Lipschitz pth order derivatives such that A requires at least  −( p+1)/ p queries to find an -stationary point. Our lower bounds are sharp to within constants, and they show that gradient descent, cubic-regularized Newton’s method, and generalized pth order regularization are worst-case optimal within their natural function classes. Keywords Non-convex optimization · Information-based complexity · Dimension-free rates · Gradient descent · Cubic regularization of Newton’s method Mathematics Subject Classification 90C06 · 90C26 · 90C30 · 90C60 · 68Q25
 
 OH was supported by the PACCAR INC fellowship. YC and JCD were partially supported by the SAIL-Toyota Center for AI Research, NSF-CAREER award 1553086, and a Sloan Foundation Fellowship in Mathematics. YC was partially supported by the Stanford Graduate Fellowship and the Numerical Technologies Fellowship.
 
 B
 
 Yair Carmon [email protected] John C. Duchi [email protected] Oliver Hinder [email protected] Aaron Sidford [email protected]
 
 1
 
 Department of Electrical Engineering, Stanford University, Stanford, CA 94305, USA
 
 2
 
 Departments of Statistics and Electrical Engineering, Stanford University, Stanford, CA 94305, USA
 
 3
 
 Department of Management Science and Engineering, Stanford University, Stanford, CA 94305, USA
 
 123
 
 Y. Carmon et al.
 
 1 Introduction Consider the optimization problem minimize f (x) x∈Rd
 
 where f : Rd → R is smooth, but possibly non-convex. In general, it is intractable to even approximately minimize such f [33,35], so—following an established line of research—we consider the problem of finding an -stationary point of f , meaning some x ∈ Rd such that ∇ f (x) ≤ .
 
 (1)
 
 We prove lower bounds on the number of function and derivative evaluations required for algorithms to find a point x satisfying inequality (1). While for arbitrary smooth f , a near-stationary point (1) is certainly insufficient for any type of optimality, there are a number of reasons to study algorithms and complexity for finding stationary points. In several statistical and engineering problems, including regression models with non-convex penalties and objectives [30,31], phase retrieval [12,42], and nonconvex (low-rank) reformulations of semidefinite programs and matrix completion [8,11,27], it is possible to show that all first- or second-order stationary points are (near) global minima. Th		
Data Loading...
 
	 
	 
	 
	 
	 
	 
	 
	 
	 
	 
	