On the interplay between acceleration and identification for the proximal gradient algorithm



Gilles Bareilles¹ (Gilles.Bareilles@univ-grenoble-alpes.fr) · Franck Iutzeler¹ (Franck.Iutzeler@univ-grenoble-alpes.fr)

¹ Laboratoire Jean Kuntzmann, Univ. Grenoble Alpes, Saint-Martin-d'Hères, France

Received: 19 September 2019
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract

In this paper, we study the interplay between acceleration and structure identification for the proximal gradient algorithm. While acceleration is generally beneficial in terms of functional decrease, we report and analyze several cases where its interplay with identification has negative effects on the algorithm's behavior (oscillation of the iterates, loss of structure, etc.). We then present a generic method that tames acceleration when structure identification may be at stake; under a qualifying condition, it benefits from a convergence rate matching that of the accelerated proximal gradient. We show empirically that the proposed method is much more stable than the accelerated proximal gradient in terms of subspace identification, while achieving a similar functional decrease.

Keywords: Proximal gradient · Accelerated methods · Identification

1 Introduction

In this paper, we consider composite optimization problems of the form

min_{x ∈ ℝⁿ} F(x) := f(x) + g(x)        (1.1)

where f and g are convex functions. We are interested in the case where f is smooth (i.e. differentiable with a Lipschitz-continuous gradient) while g is nonsmooth and enforces some structure on the solutions of (1.1) (e.g. belonging to some set, sparsity, etc.). This type of problem often appears in signal processing and machine learning, where the goal is usually to find a point with a low error on the task at hand while maintaining some structure. For instance, taking g as the 𝓁1 norm promotes sparsity; it is commonly used in the recovery of sparse signals and in compressed sensing [13]. We refer the reader to [40] for an overview.

Problem (1.1) is typically solved by the proximal gradient algorithm, a special case of Forward-Backward splitting, which alternates a gradient step on the smooth function f with a proximal step on the nonsmooth function g; see e.g. [4, Chap. 27] and the sketch below. Moreover, in order to improve the convergence of this algorithm, accelerated (also called fast or inertial) versions have been widely promoted, notably through the popularity of FISTA [7]. These modifications produce the next iterate by linearly combining the previous outputs of the proximal operator. Using combination coefficients as in Nesterov's fast method [31] (or similar ones [3, 15]) improves the convergence rate of the proximal gradient both practically and theoretically (albeit at the price of a more involved analysis, see e.g. [1, 15, 30]). Additionally, the handling of the nonsmooth function g by a proximity operator enforces some structure on the iterates.
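To make this concrete, here is a minimal sketch of the proximal gradient iteration on an 𝓁1-regularized least-squares (lasso) instance of (1.1): f(x) = ½‖Ax − b‖² is smooth, and g(x) = λ‖x‖₁ is handled through its proximal operator, which is the soft-thresholding operator. This is an illustrative sketch, not code from the paper; the data A, b and the regularization weight lam are placeholder assumptions.

```python
import numpy as np

def soft_threshold(x, tau):
    """Proximal operator of tau * ||.||_1 (soft-thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def proximal_gradient(A, b, lam, n_iter=500):
    """Proximal gradient (forward-backward) for min 0.5*||Ax - b||^2 + lam*||x||_1."""
    L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of grad f (spectral norm squared)
    gamma = 1.0 / L                      # step size
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b)                             # forward: gradient step on f
        x = soft_threshold(x - gamma * grad, gamma * lam)    # backward: proximal step on g
    return x

# Illustrative data: a sparse signal recovered from a few random measurements.
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 100))
x_true = np.zeros(100); x_true[:5] = 1.0
b = A @ x_true
x_hat = proximal_gradient(A, b, lam=0.1)
print("nonzeros:", np.count_nonzero(x_hat))   # the prox sets entries exactly to zero
```

Because soft-thresholding sets small entries exactly to zero, the iterates themselves carry the sparse structure enforced by g; this is the identification phenomenon the paper studies.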
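For comparison, here is a minimal sketch of the accelerated (FISTA-like) variant, reusing soft_threshold from the sketch above; the extrapolation coefficients follow Nesterov's classical sequence, one of several possible choices [3, 15, 31]. Again, this is an assumed illustration rather than the method proposed in the paper.

```python
def accelerated_proximal_gradient(A, b, lam, n_iter=500):
    """FISTA-like accelerated proximal gradient for the same lasso instance.

    The next iterate linearly combines the two previous outputs of the
    proximal operator (inertial / extrapolation step).
    """
    L = np.linalg.norm(A, 2) ** 2
    gamma = 1.0 / L
    x_prev = np.zeros(A.shape[1])
    y = x_prev.copy()                    # extrapolated point fed to the prox-grad step
    t = 1.0
    for _ in range(n_iter):
        grad = A.T @ (A @ y - b)
        x = soft_threshold(y - gamma * grad, gamma * lam)
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0   # Nesterov coefficient update
        y = x + ((t - 1.0) / t_next) * (x - x_prev)         # momentum on prox outputs
        x_prev, t = x, t_next
    return x_prev
```

Note that the extrapolated point y need not share the support of x when consecutive iterates differ in structure; this is exactly the kind of interaction between acceleration and identification analyzed in the paper.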