Incremental predictive clustering trees for online semi-supervised multi-target regression


Aljaž Osojnik1 · Panče Panov2 · Sašo Džeroski1

Received: 16 July 2019 / Revised: 8 July 2020 / Accepted: 19 September 2020

© The Author(s) 2020

Abstract

In many application settings, labeling data examples is costly, while unlabeled examples are abundant and cheap to produce. Labeling can be particularly problematic in an online setting, where arbitrarily many examples can arrive at high frequency. It is also problematic when we need to predict complex values (e.g., multiple real values), a task that has started receiving considerable attention, but mostly in the batch setting. In this paper, we propose a method for online semi-supervised multi-target regression. It is based on incremental trees for multi-target regression and the predictive clustering framework, and it uses unlabeled examples to improve its predictive performance over using just the labeled examples. We compare the proposed iSOUP-PCT method to supervised tree methods, which do not use unlabeled examples, and to an oracle method, which uses unlabeled examples as though they were labeled. Additionally, we compare the proposed method to the available state-of-the-art methods. The method achieves good predictive performance at the cost of increased consumption of computational resources compared to its supervised variant. The proposed method also outperforms the state of the art when very few labeled examples are available, while achieving comparable performance when labeled examples are more common.

Keywords Multi-target regression · Data stream mining · Semi-supervised learning · Predictive clustering

Editors: Larisa Soldatova, Joaquin Vanschoren.

* Aljaž Osojnik (corresponding author) · Panče Panov · Sašo Džeroski

1 Jožef Stefan Institute, Jamova cesta 39, Ljubljana, Slovenia

2 Jožef Stefan International Postgraduate School, Jožef Stefan Institute, Jamova cesta 39, Ljubljana, Slovenia


Machine Learning

1 Introduction

Recently, there has been a lot of interest in the research community in developing methods for predicting complex values. One such predictive learning task is multi-target regression (MTR), where we want to predict multiple continuous values, called targets, at the same time. The targets are assumed to be related, but equally important. Methods for MTR can be used directly to produce predictive models, or they can be utilized by more complex systems, e.g., recommender systems. Methods for MTR are fairly common in the regular batch learning setting, but rarer in the online learning setting. In the batch learning setting, the entire dataset is available at the start of the learning process, and the order of the examples in the dataset is generally assumed not to affect the learning process. In online learning, the entire dataset is not available at the start of the learning
