Convergence rates for an inexact ADMM applied to separable convex optimization
- PDF / 2,441,015 Bytes
- 26 Pages / 439.37 x 666.142 pts Page_size
- 15 Downloads / 142 Views
Convergence rates for an inexact ADMM applied to separable convex optimization William W. Hager1 · Hongchao Zhang2 Received: 13 January 2020 © Springer Science+Business Media, LLC, part of Springer Nature 2020
Abstract Convergence rates are established for an inexact accelerated alternating direction method of multipliers (I-ADMM) for general separable convex optimization with a linear constraint. Both ergodic and non-ergodic iterates are analyzed. Relative to the iteration number k, the convergence rate is O(1∕k) in a convex setting and O(1∕k2 ) in a strongly convex setting. When an error bound condition holds, the algorithm is 2-step linearly convergent. The I-ADMM is designed so that the accuracy of the inexact iteration preserves the global convergence rates of the exact iteration, leading to better numerical performance in the test problems. Keywords Separable convex optimization · Alternating direction method of multipliers · ADMM · Accelerated gradient method · Inexact methods · Global convergence · Convergence rates Mathematics Subject Classification 90C06 · 90C25 · 65Y20
1 Introduction We consider a convex, separable linearly constrained optimization problem The authors gratefully acknowledge support by the National Science Foundation under Grants 1819002 and 1819161, and by the Office of Naval Research under Grants N00014-15-1-2048 and N00014-18-1-2100. * William W. Hager [email protected] http://people.clas.ufl.edu/hager/ Hongchao Zhang [email protected] http://math.lsu.edu/~hozhang/ 1
Department of Mathematics, University of Florida, PO Box 118105, Gainesville, FL 32611‑8105, USA
2
Department of Mathematics, Louisiana State University, Baton Rouge, LA 70803‑4918, USA
13
Vol.:(0123456789)
W. W. Hager, H. Zhang
(1.1)
min Φ(𝐱) subject to 𝐀𝐱 = 𝐛,
where Φ ∶ ℝn → ℝ ∪ {∞} and 𝐀 is N by n. By a separable convex problem, we mean that the objective function is a sum of m independent parts, and the matrix is partitioned compatibly as in
Φ(𝐱) =
m ∑ i=1
fi (𝐱i ) + hi (𝐱i )
and
𝐀𝐱 =
m ∑
𝐀i 𝐱i .
(1.2)
i=1
Here fi is convex and Lipschitz continuously differentiable, hi is a proper closed con∑ vex function (possibly nonsmooth), and 𝐀i is N by ni with m n = n . There is no i=1 i column independence assumption for the 𝐀i . Constraints of the form 𝐱i ∈ Xi , where Xi is a closed convex set, can be incorporated in the optimization problem by letting hi be the indicator function of Xi . That is, hi (𝐱i ) = ∞ when 𝐱i ∉ Xi . The problem (1.1)–(1.2) has attracted extensive research due to its importance in areas such as image processing, statistical learning, and compressed sensing. See the recent survey [2] and its references. It is assumed that there exists a solution 𝐱∗ to (1.1)–(1.2) and an associated Lagrange multiplier 𝝀∗ ∈ ℝN such that the following first-order optimality conditions hold: 𝐀𝐱∗ = 𝐛 and for i = 1, 2, … , m and for all 𝐮 ∈ ℝni , we have
⟨∇fi (𝐱i∗ ) + 𝐀𝖳i 𝝀∗ , 𝐮 − 𝐱i∗ ⟩ + hi (𝐮) ≥ hi (𝐱i∗ ),
(1.3)
where ∇ denotes the gradient. A popular strategy for solving (1.1)–(1.2) is the a
Data Loading...