Safe Exploration for Active Learning with Gaussian Processes



Robert Bosch GmbH, 70442 Stuttgart, Germany, [email protected]
University of Stuttgart, MLR Laboratory, 70569 Stuttgart, Germany

Abstract. In this paper, we consider the problem of safe exploration in the active learning context. Safe exploration is especially important for data sampling from technical and industrial systems, e.g. combustion engines and gas turbines, where critical and unsafe measurements need to be avoided. The objective is to learn data-based regression models from such technical systems using a limited budget of measured, i.e. labelled, points while ensuring that critical regions of the considered systems are avoided during measurements. We propose an approach for learning such models and exploring new data regions based on Gaussian processes (GPs). In particular, we employ a problem-specific GP classifier to identify safe and unsafe regions, while using a differential entropy criterion for exploring relevant data regions. We present a theoretical analysis of the proposed algorithm and provide an upper bound on the probability of failure. To demonstrate the efficiency and robustness of our safe exploration scheme in the active learning setting, we test the approach on a policy exploration task for the inverse pendulum hold-up problem.
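The selection rule summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the problem-specific GP classifier is replaced here by plain GP regression on safety labels in {+1, -1}, and the kernel, its hyperparameters, the confidence factor `nu`, and all function names are assumptions made for illustration. The sketch relies only on the standard fact that the differential entropy of a Gaussian predictive distribution with variance σ² is ½ log(2πe σ²), so the most informative candidate is the one with the largest predictive variance among those judged safe.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.5, signal_var=1.0):
    """Squared-exponential kernel between point sets a (n, d) and b (m, d)."""
    sq_dist = (np.sum(a**2, axis=1)[:, None]
               + np.sum(b**2, axis=1)[None, :]
               - 2.0 * a @ b.T)
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)

def gp_posterior(x_train, y_train, x_query, noise_var=1e-2):
    """Posterior mean and variance of a zero-mean GP at the query points."""
    k_tt = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    k_qt = rbf_kernel(x_query, x_train)
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    v = np.linalg.solve(chol, k_qt.T)
    mean = k_qt @ alpha
    var = rbf_kernel(x_query, x_query).diagonal() - np.sum(v**2, axis=0)
    return mean, np.maximum(var, 1e-12)

def select_next_point(x_train, y_train, safety_labels, candidates, nu=2.0):
    """Index of the safe candidate with maximal predictive entropy, or None."""
    # Stand-in safety model (assumption): GP regression on +1/-1 labels.
    mean_g, var_g = gp_posterior(x_train, safety_labels, candidates)
    # A candidate counts as safe only if the discriminant is confidently positive.
    safe = mean_g - nu * np.sqrt(var_g) >= 0.0
    if not np.any(safe):
        return None
    # Regression model: entropy of the Gaussian predictive distribution.
    _, var_f = gp_posterior(x_train, y_train, candidates)
    entropy = 0.5 * np.log(2.0 * np.pi * np.e * var_f)
    entropy[~safe] = -np.inf  # exclude candidates not judged safe
    return int(np.argmax(entropy))
```

Starting from a few measurements known to be safe, repeated calls to `select_next_point` would expand the explored region outward only as far as the safety model remains confident, which mirrors the trade-off between exploration and failure avoidance described above.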

1 Introduction

Active learning (AL) deals with the problem of selective and guided generation of labeled data. In the AL setting, an agent guides the data generation process by choosing new informative samples to be labeled based on the knowledge obtained so far. Providing labels for new data points, e.g. image labels as in Lang and Baum [1992] or measurements of the system output in the case of physical systems as in Hans et al. [2008], can be very costly and tedious. The overall goal of AL is to create a data-based model without having to supply more data than necessary, thus reducing the agent's annotation effort or the number of measurements on machines. For regression tasks, the AL concept is sometimes also referred to as optimal experimental design, see Fedorov [1972].

In this paper, we consider the problem of safe data selection while jointly learning a data-based regression model on the explored input space. Given failure conditions, the goal is to actively select a budget of measurement points for approximating the model while keeping the probability of measurement failures to a minimum. In practice, safe data selection is highly relevant, especially when measurements are performed on technical systems, e.g. combustion engines and test benches. For such technical systems, it is important to avoid critical points, where the measurements can damage the system. Thus, the main objective is (i) to approximate the system model from sampled data, (ii) using a limited budget of measured points, and (iii) ensuring that critical regions of the consi

J. Schreiter et al. © Springer International Publishing Switzerland 2015. A. Bifet et al. (Eds.): ECML PKDD 2015, Part III, LNAI 9286, pp. 133–149, 2015. DOI: 10.1007/978-3-319-23461-8_9
