Parseval Proximal Neural Networks




Marzieh Hasannasab1 · Johannes Hertrich1 · Sebastian Neumayer1 · Gerlind Plonka2 · Simon Setzer3 · Gabriele Steidl1

Received: 20 December 2019
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract

The aim of this paper is twofold. First, we show that a certain concatenation of a proximity operator with an affine operator is again a proximity operator on a suitable Hilbert space. Second, we use our findings to establish so-called proximal neural networks (PNNs) and stable tight frame proximal neural networks. Let H and K be real Hilbert spaces, b ∈ K and T ∈ B(H, K) a linear operator with closed range and Moore–Penrose inverse T†. Based on the well-known characterization of proximity operators by Moreau, we prove that for any proximity operator Prox : K → K the operator T† Prox(T · + b) is a proximity operator on H equipped with a suitable norm. In particular, it follows for the frequently applied soft shrinkage operator Prox = S_λ : ℓ_2 → ℓ_2 and any frame analysis operator T : H → ℓ_2 that the frame shrinkage operator T† S_λ T is a proximity operator on a suitable Hilbert space. The concatenation of proximity operators on R^d equipped with different norms establishes a PNN. If the network arises from tight frame analysis or synthesis operators, then it forms an averaged operator. In particular, it has Lipschitz constant 1 and belongs to the class of so-called Lipschitz networks, which were recently applied to defend against adversarial attacks. Moreover, due to their averaging property, PNNs can be used within so-called Plug-and-Play algorithms with convergence guarantees. In the case of Parseval frames, we call the networks Parseval proximal neural networks (PPNNs). Then the involved linear operators lie in a Stiefel manifold, and corresponding minimization methods can be applied to train such networks. Finally, some proof-of-concept examples demonstrate the performance of PPNNs.

Keywords: Lipschitz neural networks · Averaged operators · Proximal operators · Frame shrinkage · Adversarial robustness · Optimization on Stiefel manifolds

Mathematics Subject Classification: 68T07 · 90C26
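To make the central objects concrete, here is a minimal numerical sketch (not from the paper): the componentwise soft shrinkage operator S_λ and the frame shrinkage T† S_λ T for an orthogonal T, where T† = Tᵀ. The check at the end illustrates the nonexpansiveness (Lipschitz constant at most 1) that the abstract attributes to tight frame PNN layers; all function names and the random test setup are ours.

```python
import numpy as np

def soft_shrink(x, lam):
    """Soft shrinkage S_lambda(x) = sign(x) * max(|x| - lambda, 0), componentwise."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def frame_shrink(T, x, lam):
    """Frame shrinkage T^† S_lambda(T x); for orthogonal T, the pseudoinverse is T^T."""
    return np.linalg.pinv(T) @ soft_shrink(T @ x, lam)

rng = np.random.default_rng(0)
d, lam = 8, 0.5
T, _ = np.linalg.qr(rng.standard_normal((d, d)))  # random orthogonal matrix

# Nonexpansiveness: ||F(x) - F(y)|| <= ||x - y|| for F = T^† S_lambda T
x, y = rng.standard_normal(d), rng.standard_normal(d)
lhs = np.linalg.norm(frame_shrink(T, x, lam) - frame_shrink(T, y, lam))
rhs = np.linalg.norm(x - y)
assert lhs <= rhs + 1e-12
```

Since S_λ is a proximity operator it is 1-Lipschitz, and an orthogonal T preserves Euclidean norms, so the composed map cannot expand distances; this is the elementary special case of the paper's general statement.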

Communicated by Holger Rauhut. Extended author information available on the last page of the article.

Journal of Fourier Analysis and Applications (2020) 26:59

1 Introduction

Wavelet and frame shrinkage operators have become very popular in recent years. A certain starting point was the iterative shrinkage-thresholding algorithm (ISTA) in [16], which was interpreted as a special case of the forward-backward algorithm in [14]. For relations with other algorithms see also [8,40]. Let T ∈ R^{n×d}, n ≥ d, have full column rank. Then, the problem

argmin_{y ∈ R^d} (1/2) ‖x − y‖_2^2 + λ ‖T y‖_1,  λ > 0,  (1)

is known as the analysis point of view. For orthogonal T ∈ R^{d×d}, the solution of (1) is given by the frame soft shrinkage operator T† S_λ T = T* S_λ T, see Example 2.3. If T ∈ R^{n×d} with n ≤ d and T T* = I_n, the solution of problem