Regression with Linear Factored Functions
Abstract. Many applications that use empirically estimated functions face a curse of dimensionality, because integrals over most function classes must be approximated by sampling. This paper introduces a novel regression algorithm that learns linear factored functions (LFF). This class of functions has structural properties that allow certain integrals to be solved analytically and point-wise products to be computed in closed form. Applications like belief propagation and reinforcement learning can exploit these properties to break the curse and speed up computation. We derive a regularized greedy optimization scheme that learns factored basis functions during training. The novel regression algorithm performs competitively with Gaussian processes on benchmark tasks, and the learned LFF are very compact, requiring only 4-9 factored basis functions on average.

Keywords: Regression · Factored functions · Curse of dimensionality

1 Introduction
This paper introduces a novel regression algorithm that performs competitively with Gaussian processes but yields linear factored functions (LFF). These have outstanding properties such as analytical point-wise products and marginalization. Regression is a well-known problem that can be solved by many non-linear architectures such as kernel methods (Shawe-Taylor and Cristianini 2004) or neural networks (Haykin 1998). While these perform well, the estimated functions often suffer from a curse of dimensionality in later applications. For example, computing an integral over a neural network or a kernel function requires sampling the entire input space. Applications like belief propagation (Pearl 1988) and reinforcement learning (Kaelbling et al. 1996), on the other hand, face large input spaces and therefore require efficient computations. We propose LFF for this purpose and showcase their properties in comparison to kernel functions.
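To illustrate why factored structure avoids sampling the full input space, the sketch below uses a toy factored form f(x) = Σ_k Π_d g_{k,d}(x_d), where each one-dimensional factor g_{k,d} is linear in a small cosine basis. This is only a minimal illustration under assumed names (fourier_basis, factored_integral) and a generic basis choice, not the paper's LFF parameterization; the one-dimensional integrals are computed here by quadrature for brevity, whereas the paper solves its integrals analytically for the chosen bases. The key point is that the D-dimensional integral splits into D one-dimensional integrals, so the cost grows linearly rather than exponentially in the input dimension.

```python
import numpy as np

def fourier_basis(x, n=5):
    """One-dimensional cosine basis evaluated on points x in [0, 1]."""
    return np.cos(np.pi * np.outer(x, np.arange(n)))          # shape (len(x), n)

def factored_integral(A, lo=0.0, hi=1.0, n_quad=50):
    """Integrate f(x) = sum_k prod_d phi(x_d) @ A[k, d] over [lo, hi]^D.

    A has shape (K, D, n): coefficients of each 1-D factor in the basis.
    The D-dimensional integral factorizes into D one-dimensional ones,
    so no sampling of the D-dimensional input space is required.
    """
    xs = np.linspace(lo, hi, n_quad)
    Phi = fourier_basis(xs)                                    # (n_quad, n)
    # 1-D integral of every factor g_{k,d}, via the trapezoidal rule
    per_dim = np.trapz(Phi @ A.transpose(0, 2, 1), xs, axis=1) # (K, D)
    # product over dimensions, sum over the K factored basis functions
    return per_dim.prod(axis=1).sum()

K, D, n = 3, 10, 5                    # 3 factored basis functions, 10 input dims
A = np.random.randn(K, D, n) * 0.1    # toy coefficients (hypothetical example)
print(factored_integral(A))           # cost scales linearly in D
```

In contrast, the same integral over a generic kernel expansion or neural network offers no such factorization and must be approximated by sampling the entire D-dimensional input space.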
1.1 Kernel Regression
In the last 20 years, kernel methods like support vector machines (SVM, Boser et al. 1992; Vapnik 1995) have become a de facto standard in various practical applications. This is mainly due to the sparse representation of the learned classifiers with so-called support vectors (SV). The most popular kernel method for regression, Gaussian processes (GP, see Bishop 2006; Rasmussen and Williams 2006), on the other hand, requires as many SV as training samples. Sparse versions of GP therefore aim for a small subset of SV. Some select this set based on constraints similar to SVM (Tipping 2001; Vapnik 1995), while others try to conserve the spanned linear function space (sparse GP, Csató and Opper 2002; Rasmussen and Williams 2006). There are also attempts to construct new SV by averaging similar training samples (e.g. Wang et al. 2012). Well-chosen SV for regression are usually not sparsely concentrated on a decision boundary as they are for SVM. In fact, many practical appli