A Unifying Representer Theorem for Inverse Problems and Machine Learning
Michael Unser

Received: 30 November 2019 / Revised: 9 July 2020 / Accepted: 9 July 2020
© The Author(s) 2020
Abstract
Regularization addresses the ill-posedness of the training problem in machine learning or of the reconstruction of a signal from a limited number of measurements. The method is applicable whenever the problem is formulated as an optimization task. The standard strategy consists in augmenting the original cost functional by an energy that penalizes solutions with undesirable behavior. The effect of regularization is very well understood when the penalty involves a Hilbertian norm. Another popular configuration is the use of an $\ell_1$-norm (or some variant thereof) that favors sparse solutions. In this paper, we propose a higher-level formulation of regularization within the context of Banach spaces. We present a general representer theorem that characterizes the solutions of a remarkably broad class of optimization problems. We then use our theorem to retrieve a number of known results in the literature, such as the celebrated representer theorem of machine learning for RKHS, Tikhonov regularization, representer theorems for sparsity-promoting functionals, and the recovery of spikes, as well as a few new ones.

Keywords Convex optimization · Regularization · Representer theorem · Inverse problem · Machine learning · Banach space

Mathematics Subject Classification 46N10 · 47A52 · 65J20 · 68T05
Communicated by Tomaso Poggio. The research leading to these results has received funding from the Swiss National Science Foundation under Grant 200020-184646/1.
Michael Unser
[email protected]

Biomedical Imaging Group, Ecole polytechnique fédérale de Lausanne (EPFL), Station 17, CH-1015 Lausanne, Switzerland
1 Introduction

A recurrent problem in science and engineering is the reconstruction of a multidimensional signal $f: \mathbb{R}^d \to \mathbb{R}$ from a finite number of (possibly noisy) linear measurements $\mathbf{y} = (y_m) = \boldsymbol{\nu}(f) \in \mathbb{R}^M$, where the operator $\boldsymbol{\nu} = (\nu_m): f \mapsto \boldsymbol{\nu}(f) = (\langle \nu_1, f \rangle, \ldots, \langle \nu_M, f \rangle)$ symbolizes the linear measurement process. The machine-learning version of the problem is the determination of a function $f: \mathbb{R}^d \to \mathbb{R}$ from a finite number of samples $y_m = f(\boldsymbol{x}_m) + \epsilon_m$, where $\epsilon_m$ is a small perturbation term; it is a special case of the former with $\nu_m = \delta(\cdot - \boldsymbol{x}_m)$. Since a function that takes values over the continuum is an infinite-dimensional entity, the reconstruction problem is inherently ill-posed. The standard remedy is to impose an additional minimum-energy requirement which, in effect, regularizes the solution. A natural choice of regularization is a smoothness norm associated with some function space $\mathcal{X}$ (typically, a Sobolev space), which results in the prototypical formulation of the problem as
$$
\mathcal{S} = \arg\min_{f \in \mathcal{X}} \|f\|_{\mathcal{X}} \quad \text{s.t.} \quad \langle \nu_m, f \rangle = y_m, \quad m = 1, \ldots, M. \tag{1}
$$
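To fix ideas with a finite-dimensional analogue (not part of the paper), one may think of the unknown as a coefficient vector $\mathbf{c} \in \mathbb{R}^N$ and of each functional $\nu_m$ as a row of a measurement matrix, in which case the minimum-norm solution consistent with the data is given by the pseudoinverse. In the sketch below, the matrix H, the dimensions, and the data are illustrative assumptions.

```python
import numpy as np

# Illustrative finite-dimensional analogue of the constrained problem (1):
# the "signal" is a coefficient vector c in R^N, each measurement is a linear
# functional <nu_m, f> modeled as a row of H, and we pick the minimum-l2-norm
# vector c that reproduces the data exactly.
rng = np.random.default_rng(1)
N, M = 50, 8                        # many unknowns, few measurements (ill-posed)
H = rng.standard_normal((M, N))     # measurement matrix; rows play the role of nu_m
y = rng.standard_normal(M)          # observed data y_m

c_star = np.linalg.pinv(H) @ y      # arg min ||c||_2  subject to  H c = y
assert np.allclose(H @ c_star, y)   # the measurement constraints are met exactly
```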
An alternative version that is better suited for noisy data is
$$
\mathcal{S} = \arg\min_{f \in \mathcal{X}} \sum_{m=1}^{M} |y_m - \langle \nu_m, f \rangle|^2 + \lambda \|f\|_{\mathcal{X}}^{p}.
$$
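For the Hilbertian case $p = 2$ with sampling functionals $\nu_m = \delta(\cdot - \boldsymbol{x}_m)$, the celebrated RKHS representer theorem mentioned in the abstract reduces the search to a finite kernel expansion $f(\boldsymbol{x}) = \sum_m a_m k(\boldsymbol{x}, \boldsymbol{x}_m)$. The following minimal numerical sketch illustrates this; the Gaussian kernel, the synthetic 1-D data, and the value of $\lambda$ are assumptions made for the example, not choices taken from the paper.

```python
import numpy as np

# Minimal sketch of the penalized problem for p = 2 in an RKHS: minimizing
# sum_m |y_m - f(x_m)|^2 + lambda * ||f||_H^2 over f in the RKHS of k reduces,
# by the representer theorem, to solving (K + lambda I) a = y for the
# coefficients of f(x) = sum_m a_m k(x, x_m).
rng = np.random.default_rng(0)

def k(x, y, sigma=0.5):
    """Gaussian reproducing kernel (illustrative choice)."""
    return np.exp(-(x - y) ** 2 / (2 * sigma ** 2))

# Noisy samples y_m = f(x_m) + eps_m of an unknown 1-D function.
x = np.linspace(0.0, 1.0, 20)                       # sample locations x_m
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(x.size)

lam = 1e-2                                          # regularization weight lambda
K = k(x[:, None], x[None, :])                       # Gram matrix K[m, n] = k(x_m, x_n)
a = np.linalg.solve(K + lam * np.eye(x.size), y)    # expansion coefficients a_m

# Reconstructed function evaluated on a fine grid.
t = np.linspace(0.0, 1.0, 200)
f_hat = k(t[:, None], x[None, :]) @ a
```

With $p = 1$ or other sparsity-promoting functionals, no such simple closed form is available, which is part of what motivates the Banach-space framework developed in the paper.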