Concentration estimates for learning with unbounded sampling
Zheng-Chu Guo · Ding-Xuan Zhou
Received: 17 May 2010 / Accepted: 12 September 2011 / Published online: 11 October 2011 © Springer Science+Business Media, LLC 2011
Abstract  The least-square regression problem is considered by regularization schemes in reproducing kernel Hilbert spaces. The learning algorithm is implemented with samples drawn from unbounded sampling processes. The purpose of this paper is to present concentration estimates for the error based on $\ell^2$-empirical covering numbers, which improve the learning rates in the literature.

Keywords  Learning theory · Least-square regression · Regularization in reproducing kernel Hilbert spaces · Empirical covering number · Concentration estimates

Mathematics Subject Classifications (2010)  68T05 · 62J02
Communicated by Lixin Shen.

The work described in this paper is supported partially by the Research Grants Council of Hong Kong [Project No. CityU 103508] and the National Science Fund for Distinguished Young Scholars of China [Project No. 10529101].

Z.-C. Guo, School of Mathematics and Computational Science, Sun Yat-sen University, Guangzhou 510275, People's Republic of China. e-mail: [email protected]

Z.-C. Guo · D.-X. Zhou (B), Department of Mathematics, City University of Hong Kong, Kowloon, Hong Kong, People's Republic of China. e-mail: [email protected]
1 Introduction and main results

We consider the regression problem by learning with unbounded sampling processes. The target functions for learning are defined on a complete separable metric space $X$ (input space) and take values in $Y = \mathbb{R}$ (output space). Learning is implemented by algorithms and samples. In our setting it is assumed that a sample $z = \{z_i = (x_i, y_i)\}_{i=1}^{m}$ is drawn independently from a Borel probability measure $\rho$ on $Z := X \times Y$. Our target is the regression function defined by

$$f_\rho(x) = \int_Y y \, d\rho(y|x), \qquad x \in X,$$

where $\rho(\cdot|x)$ is the conditional distribution of $\rho$ at $x \in X$.
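A standard identity, worth recalling here although it is not displayed at this point in the paper, explains why $f_\rho$ is the natural learning target. Writing $\rho_X$ for the marginal distribution of $\rho$ on $X$ and $\mathcal{E}(f) = \int_Z (f(x) - y)^2 \, d\rho$ for the least-squares risk, one has for every $f \in L^2_{\rho_X}$

$$\mathcal{E}(f) = \int_X \big(f(x) - f_\rho(x)\big)^2 \, d\rho_X + \mathcal{E}(f_\rho),$$

since the cross term vanishes after integrating over $\rho(\cdot|x)$. Thus $f_\rho$ minimizes the risk, and the excess risk $\mathcal{E}(f) - \mathcal{E}(f_\rho)$ coincides with $\|f - f_\rho\|_{L^2_{\rho_X}}^2$, the quantity that learning rates bound.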
The learning algorithm studied in this paper is based on a bounded Mercer kernel $K : X \times X \to \mathbb{R}$, that is, a continuous, symmetric, and positive semidefinite function. The reproducing kernel Hilbert space (RKHS) $\mathcal{H}_K$ associated with $K$ is the completion of $\mathrm{span}\{K_x = K(\cdot, x) : x \in X\}$ [2] with respect to the inner product $\langle \cdot, \cdot \rangle_K$ given by $\langle K_x, K_{x'} \rangle_K = K(x, x')$. The learning algorithm considered here is a regularization scheme in $\mathcal{H}_K$ given by

$$f_{z,\lambda} = \arg\min_{f \in \mathcal{H}_K} \left\{ \frac{1}{m} \sum_{i=1}^{m} \big(f(x_i) - y_i\big)^2 + \lambda \|f\|_K^2 \right\}, \qquad (1.1)$$

where $\lambda = \lambda(m) > 0$ is a regularization parameter.

There is a large literature on error analysis for algorithm (1.1); see e.g. [8, 10, 12, 15, 23, 28]. Most results are stated under the standard uniform boundedness assumption for the output: for some constant $M > 0$, $|y| \le M$ almost surely. This standard assumption is abandoned in [5, 9, 22]. In [5, 9], it is assumed that the output satisfies the condition

$$\int_Y \left( \exp\left( \frac{|y - f_{\mathcal{H}}(x)|}{M} \right) - \frac{|y - f_{\mathcal{H}}(x)|}{M} - 1 \right) d\rho(y|x) \le \frac{\Sigma^2}{2M^2} \qquad \text{a.e. } x \in X, \qquad (1.2)$$
for some constants $M, \Sigma > 0$, where $f_{\mathcal{H}}$ is the projection of the regression function $f_\rho$ onto the closure of $\mathcal{H}_K$ in $L^2_{\rho_X}$.
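Condition (1.2) is a Bernstein-type moment condition that permits unbounded outputs; Gaussian noise around $f_{\mathcal{H}}$, for instance, satisfies it with suitable $M$ and $\Sigma$. A minimal Monte Carlo sanity check of this claim, in which the noise level, $M$, and the sample size are illustrative choices rather than anything taken from the paper:

    import numpy as np

    # Residuals y - f_H(x) drawn as Gaussian noise; sigma and M are illustrative.
    rng = np.random.default_rng(1)
    sigma, M = 0.5, 2.0
    eps = rng.normal(0.0, sigma, 1_000_000)
    # Estimate the left-hand side of (1.2) by averaging over the sample.
    lhs = np.mean(np.exp(np.abs(eps) / M) - np.abs(eps) / M - 1.0)
    # (1.2) requires lhs <= Sigma^2 / (2 M^2); any finite lhs yields a valid Sigma.
    print(f"lhs = {lhs:.6f}, condition holds with Sigma = {np.sqrt(2 * M**2 * lhs):.4f}")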
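Scheme (1.1) itself is kernel ridge regression: by the representer theorem its minimizer has the finite-dimensional form $f_{z,\lambda} = \sum_{i=1}^{m} \alpha_i K_{x_i}$, with coefficients solving $(\mathbf{K} + \lambda m I)\alpha = y$ for the Gram matrix $\mathbf{K} = (K(x_i, x_j))_{i,j=1}^{m}$. The following NumPy sketch implements this reduction on synthetic data; the Gaussian kernel, the toy target, and all names in it are illustrative assumptions, not taken from the paper.

    import numpy as np

    def gaussian_kernel(X1, X2, sigma=1.0):
        # Bounded Mercer kernel K(x, x') = exp(-|x - x'|^2 / (2 sigma^2)).
        d2 = np.sum(X1**2, 1)[:, None] + np.sum(X2**2, 1)[None, :] - 2 * X1 @ X2.T
        return np.exp(-d2 / (2 * sigma**2))

    def krr_fit(X, y, lam, sigma=1.0):
        # Representer theorem: f_{z,lambda} = sum_i alpha_i K(., x_i), where
        # alpha solves (K + lambda * m * I) alpha = y for scheme (1.1).
        m = X.shape[0]
        K = gaussian_kernel(X, X, sigma)
        return np.linalg.solve(K + lam * m * np.eye(m), y)

    def krr_predict(alpha, X_train, X_test, sigma=1.0):
        return gaussian_kernel(X_test, X_train, sigma) @ alpha

    # Toy sample with Gaussian (hence unbounded) noise: |y| <= M fails almost
    # surely for no M small enough, yet condition (1.2) holds.
    rng = np.random.default_rng(0)
    m = 200
    X = rng.uniform(-1, 1, (m, 1))
    y = np.sin(np.pi * X[:, 0]) + rng.normal(0.0, 0.3, m)
    alpha = krr_fit(X, y, lam=1e-2)
    X_test = np.linspace(-1, 1, 5)[:, None]
    print(krr_predict(alpha, X, X_test))

Note the factor $\lambda m$ in the linear system: it comes from the $\frac{1}{m}$ normalization of the empirical risk in (1.1).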