Learning with Kernels and Logical Representations
P. Frasconi
Abstract. Choosing an appropriate kernel function is a fundamental step for the application of many popular statistical learning algorithms. Kernels are actually the natural entry point for inserting prior knowledge into the learning process. Inductive logic programming (ILP), on the other hand, offers a powerful and flexible framework for describing existing background knowledge and extracting additional knowledge from the data. It therefore seems natural to explore the synergy between these two important paradigms of machine learning. In this extended abstract (see [1] for a longer version), I briefly review some of our recent work about statistical learning with kernel machines in the ILP setting.
1 Motivations
Statistical and logical approaches to machine learning offer complementary advantages. Logic allows us to represent domain knowledge in a natural and expressive way, and ILP can generate theories and explanations. Statistical learning, on the other hand, allows us to deal with uncertainty and noise in the data. Probabilistic inductive logic programming (PILP), also called statistical relational learning, is a very active area of research, and several representational frameworks and models have been proposed in recent years (see e.g. [2,3] for an overview). It essentially relies on the combined use of logic and probabilities in the learning process.

One distinction that is often made in statistical supervised learning is between generative and discriminant classifiers. In the former case, we typically attempt to model class-conditional densities and use Bayes' theorem to obtain the conditional probability of the output label given the input. In the latter case, one attempts to model conditional probabilities directly or, even more simply, to learn a discriminant function that consistently approximates the optimal decision function as the number of training examples grows to infinity. Several PILP approaches are based on generative learning. For example, stochastic logic programs are a generalization of probabilistic context-free grammars that assign a probability to each definite clause in a logic program and allow us to infer the probability that a given goal is refuted. The approaches briefly reviewed here take the discriminant direction and exploit classic statistical supervised learning algorithms based on kernel machines.
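To make the generative/discriminant distinction concrete (this is the standard textbook formulation, not anything specific to [1]): for an input $x$ and a class label $y$, a generative classifier models the class-conditional density $P(x \mid y)$ and the prior $P(y)$, recovering the posterior via Bayes' theorem,

$$
P(y \mid x) \;=\; \frac{P(x \mid y)\,P(y)}{\sum_{y'} P(x \mid y')\,P(y')},
$$

whereas a discriminant classifier models $P(y \mid x)$, or only the induced decision function $\hat{y}(x) = \arg\max_y P(y \mid x)$, directly, with no commitment to a model of the input distribution.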
Although several kernels have been defined on discrete data structures such as strings, trees, and graphs, there are several motivations for studying the combination of kernels with logic:

– Improving and facilitating kernel design. Background knowledge is usually plugged in via the kernel function. We can take background knowledge expressed as a logic program and convert it into a kernel, thus embedding it into a statistical learning algorithm in a principled and flexible way (see the sketch after this list).

– Improving the accuracy and the efficiency of existing ILP systems, for example ...
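To illustrate the first point in the list above, here is a minimal, hypothetical sketch, not the actual system reviewed in [1]: each example is mapped, through a logic program, to the set of ground facts it entails; similarity is then measured by an intersection kernel on these sets, and the resulting Gram matrix is handed to an off-the-shelf support vector machine. The toy predicates, the intersection kernel, and the use of scikit-learn's precomputed-kernel mode are all illustrative assumptions.

```python
import numpy as np
from sklearn.svm import SVC

# Toy relational examples: each example is represented by the set of
# ground facts that a (hypothetical) logic program derives for it.
examples = [
    {"atom(c)", "bond(c, c)", "ring(r1)"},
    {"atom(c)", "atom(o)", "bond(c, o)"},
    {"atom(c)", "bond(c, c)", "atom(h)"},
    {"atom(o)", "atom(h)", "bond(o, h)"},
]
labels = np.array([1, -1, 1, -1])

def intersection_kernel(a, b):
    # Counting shared ground facts is a valid (positive semidefinite)
    # kernel: it equals the dot product of indicator vectors over the
    # universe of derivable facts.
    return float(len(a & b))

# The Gram matrix is where the logical background knowledge enters
# the statistical learner.
K = np.array([[intersection_kernel(a, b) for b in examples]
              for a in examples])

clf = SVC(kernel="precomputed")
clf.fit(K, labels)
print(clf.predict(K))  # re-predict the training examples
```

The design point is that the statistical learner never sees the logic: all background knowledge is funneled through the Gram matrix. Richer logic-based kernels (e.g. kernels over Prolog proof trees) follow the same pattern, computing similarity over proofs or clauses rather than plain ground facts.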