Variable Smoothing for Convex Optimization Problems Using Stochastic Gradients

Radu Ioan Boţ¹ · Axel Böhm¹

Received: 16 May 2019 / Revised: 12 March 2020 / Accepted: 3 October 2020
© The Author(s) 2020

Abstract  We aim to solve a structured convex optimization problem in which a nonsmooth function is composed with a linear operator. When opting for full splitting schemes, primal–dual type methods are usually employed, as they are effective and well studied. However, under the additional assumption that the nonsmooth function composed with the linear operator is Lipschitz continuous, we can derive novel algorithms through regularization via the Moreau envelope. Furthermore, we tackle large-scale problems by means of stochastic oracle calls, very similar to stochastic gradient techniques. Applications to total variation denoising and deblurring, and to matrix factorization, are provided.

Keywords  Structured convex optimization problem · Variable smoothing algorithm · Convergence rate · Stochastic gradients

Mathematics Subject Classification  90C25 · 90C15 · 65Y20

1 Introduction

The problem at hand is the following structured convex optimization problem

$$\min_{x \in \mathcal{H}} \; f(x) + g(Kx), \tag{1}$$

for real Hilbert spaces $\mathcal{H}$ and $\mathcal{G}$, $f : \mathcal{H} \to \overline{\mathbb{R}} := \mathbb{R} \cup \{\pm\infty\}$ a proper, convex and lower semicontinuous function, $g : \mathcal{G} \to \mathbb{R}$ a, possibly nonsmooth, convex and Lipschitz continuous function, and $K : \mathcal{H} \to \mathcal{G}$ a linear continuous operator.
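As a concrete instance of this template (an illustration of ours, in line with the denoising application mentioned in the abstract), total variation denoising of a noisy signal $b$ fits the form (1):

```latex
% Illustrative instantiation of (1) for total variation denoising
% (our example; D denotes the discrete gradient operator):
%   f(x) = (1/2)||x - b||^2,    proper, convex, lower semicontinuous,
%   g    = \lambda\|\cdot\|_1,  nonsmooth, convex, Lipschitz continuous,
%   K    = D,                   a linear continuous operator.
\min_{x \in \mathbb{R}^n} \; \tfrac{1}{2}\|x - b\|^2 + \lambda \|Dx\|_1,
\qquad \lambda > 0.
```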

Research partially supported by FWF (Austrian Science Fund) project I 2419-N32. Research supported by the doctoral programme Vienna Graduate School on Computational Optimization (VGSCO), FWF (Austrian Science Fund), Project W 1260.

Radu Ioan Boţ
[email protected]

Axel Böhm
[email protected]

1  Faculty of Mathematics, University of Vienna, Oskar-Morgenstern-Platz 1, 1090 Vienna, Austria


Our aim will be to devise an algorithm for solving (1) following the full splitting paradigm (see [5,6,8,9,15,17,29]). In other words, we allow only proximal evaluations of simple nonsmooth functions, but no proximal evaluations of compositions with linear continuous operators such as, for instance, $g \circ K$. We will accomplish this by means of a smoothing strategy, which, for the purposes of this paper, means making use of the Moreau–Yosida approximation. The approach can be described as follows: we “smooth” $g$, i.e. we replace it by its Moreau envelope, and solve the resulting optimization problem by an accelerated proximal-gradient algorithm (see [3,13,21]).

This approach is similar to those in [7,10,11,20,22], where a convergence rate of $\mathcal{O}\!\left(\frac{\log(k)}{k}\right)$ is proved. However, our techniques (for the deterministic case) resemble more the ones in [28], where an improved rate of $\mathcal{O}\!\left(\frac{1}{k}\right)$ is shown, the most notable difference to our work being that we use a simpler stepsize and treat the stochastic case. The only other family of methods able to solve problems of type (1) is that of the so-called primal–dual algorithms
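To make the smoothing strategy concrete, below is a minimal sketch (our own illustration, not the authors' exact scheme) of a proximal-gradient iteration on the smoothed objective $f(x) + g_{\mu}(Kx)$. It relies on the standard identity $\nabla g_{\mu}(y) = (y - \mathrm{prox}_{\mu g}(y))/\mu$ for the gradient of the Moreau envelope. The names `prox_f`, `prox_g` and the fixed smoothing parameter `mu` are placeholders; the variable smoothing of the paper uses an iteration-dependent $\mu_k$ together with acceleration.

```python
import numpy as np

def smoothed_prox_gradient(prox_f, prox_g, K, x0, mu=1e-2, L_K=1.0,
                           n_iters=1000):
    """Proximal-gradient sketch for x -> f(x) + g_mu(Kx), with g_mu the
    Moreau envelope of g. Assumptions (not from the paper's notation):
      prox_f(x, gam): proximal operator of gam * f
      prox_g(y, gam): proximal operator of gam * g
      K: linear operator given as a matrix (K.T is its adjoint)
      L_K: upper bound on the operator norm ||K||
    A fixed mu is used here; the paper decreases mu_k along the
    iterations and adds acceleration."""
    x = x0.copy()
    # x -> g_mu(Kx) is smooth with gradient K^T (Kx - prox_{mu g}(Kx))/mu
    # and Lipschitz constant ||K||^2 / mu, hence the stepsize below.
    gamma = mu / L_K**2
    for _ in range(n_iters):
        y = K @ x
        grad = K.T @ ((y - prox_g(y, mu)) / mu)
        x = prox_f(x - gamma * grad, gamma)
    return x

# Illustrative use: 1-D total variation denoising,
# f(x) = 0.5*||x - b||^2, g = lam*||.||_1, K = forward differences.
n = 100
rng = np.random.default_rng(0)
b = np.repeat([0.0, 1.0], n // 2) + 0.1 * rng.standard_normal(n)
K = np.eye(n, k=1)[:-1] - np.eye(n)[:-1]            # (n-1) x n differences
lam = 0.5
prox_f = lambda x, gam: (x + gam * b) / (1 + gam)   # prox of gam*f
prox_g = lambda y, gam: np.sign(y) * np.maximum(np.abs(y) - gam * lam, 0.0)
x_hat = smoothed_prox_gradient(prox_f, prox_g, K, x0=b.copy(),
                               mu=1e-2, L_K=2.0, n_iters=500)
```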