Globalized inexact proximal Newton-type methods for nonconvex composite functions



Christian Kanzow · Theresa Lechner

University of Würzburg, Institute of Mathematics, Emil-Fischer-Str. 30, 97074 Würzburg, Germany

Received: 27 February 2020 / Accepted: 26 October 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract

Optimization problems with composite functions consist of an objective function which is the sum of a smooth and a (convex) nonsmooth term. This particular structure is exploited by the class of proximal gradient methods and some of their generalizations, such as proximal Newton and quasi-Newton methods. The current literature on these classes of methods almost exclusively considers the case where the smooth term is also convex. Here we present a globalized proximal Newton-type method which allows the smooth term to be nonconvex. The method is shown to have favorable global and local convergence properties, and some numerical results indicate that the method is also very promising from a practical point of view.

1 Introduction

In this paper, we deal with the composite problem

$$\min_{x \in \mathbb{R}^n} \; \psi(x) := f(x) + \varphi(x), \tag{1}$$

where $f : \mathbb{R}^n \to \mathbb{R}$ is (twice) continuously differentiable and $\varphi : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ is convex, proper, and lower semicontinuous (lsc). In this formulation, the objective function $\psi$ is neither convex nor smooth, so it covers a wide class of applications, as described below. Since $\varphi$ is allowed to take the value $+\infty$, (1) also comprises constrained problems on convex sets.
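Since proximal Newton-type methods generalize the proximal gradient iteration, a minimal sketch of the latter may help fix ideas. The following is illustrative only and not the method proposed in this paper: it performs one forward-backward step on (1) under the assumption $\varphi(x) = \lambda \|x\|_1$, whose proximal operator is the soft-thresholding map; the function names and test data are ours.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1: componentwise sign(v) * max(|v| - t, 0)
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def proximal_gradient_step(x, grad_f, step, lam):
    # One forward-backward step on psi = f + lam * ||.||_1:
    #   x_new = prox_{step * phi}(x - step * grad_f(x))
    return soft_threshold(x - step * grad_f(x), step * lam)

# Example: one step on f(x) = ||A x - b||_2^2 with gradient 2 A^T (A x - b)
A = np.array([[1.0, 0.0], [0.0, 2.0]])
b = np.array([1.0, -1.0])
grad_f = lambda x: 2.0 * A.T @ (A @ x - b)
x_new = proximal_gradient_step(np.zeros(2), grad_f, step=0.1, lam=0.5)
```

Proximal Newton and quasi-Newton methods, the subject of this paper, replace the scaled-identity model of $f$ behind this step with (approximate) second-order information; in practice the step size would be tied to a Lipschitz constant of $\nabla f$ or chosen by a line search.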


1.1 Background

Optimization problems of the form (1) arise in many applications in statistics, machine learning, compressed sensing, and signal processing. Common applications are the lasso [42] and related problems, where the function $f$ represents a smooth loss function such as the quadratic loss $f(x) := \|Ax - b\|_2^2$ or the logistic loss $f(x) := \frac{1}{m} \sum_{i=1}^{m} \log\bigl(1 + \exp(a_i^T x)\bigr)$ for some given data $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$, and $a_i \in \mathbb{R}^n$ for $i = 1, \ldots, m$. A convex regularizer $\varphi$ is added to impose additional constraints or to promote sparsity. Typical regularizers are the $\ell_1$- and $\ell_2$-norms, a weighted $\ell_1$-norm $\varphi(x) := \sum_{i=1}^{n} \omega_i |x_i|$ with weights $\omega_i > 0$, or the total variation $\varphi(x) = \|\nabla x\| := \sum_{i=1}^{n-1} |x_{i+1} - x_i|$ (see the sketch below for concrete evaluations of these terms). Loss problems of this kind are typically used to reconstruct blurred or incomplete data or to classify data.

Another type of application is the inverse covariance estimation problem [3, 46]. The aim of this problem class is to find the (sparse) inverse covariance matrix of a probability distribution from independently and identically distributed samples. For further applications, where the function $f$ is assumed to be convex, we refer to the list given by Combettes and Wajs [17] and the references therein. Further problems of the form (1) are constrained problems [10] arising in the above-mentioned fields.
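To make these loss functions and regularizers concrete, the following sketch evaluates each of the terms listed above. It is purely illustrative: the function names are ours, not the paper's, and the implementations assume the formulas exactly as stated.

```python
import numpy as np

def quadratic_loss(x, A, b):
    # f(x) = ||A x - b||_2^2
    r = A @ x - b
    return float(r @ r)

def logistic_loss(x, A):
    # f(x) = (1/m) * sum_{i=1}^m log(1 + exp(a_i^T x)), a_i = rows of A
    return float(np.mean(np.logaddexp(0.0, A @ x)))  # stable log(1 + exp(.))

def weighted_l1(x, w):
    # phi(x) = sum_i w_i * |x_i| with weights w_i > 0
    return float(np.sum(w * np.abs(x)))

def total_variation(x):
    # phi(x) = ||grad x|| = sum_{i=1}^{n-1} |x_{i+1} - x_i|
    return float(np.sum(np.abs(np.diff(x))))

# Example: lasso-type objective psi(x) = f(x) + phi(x) as in (1)
A = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([1.0, 0.0])
x = np.array([0.5, -0.5])
psi = quadratic_loss(x, A, b) + weighted_l1(x, np.ones(2))
```

Nonconvex applications occur, e.g., in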