Semi-supervised Learning in Causal and Anticausal Settings

Bernhard Schölkopf, Dominik Janzing, Jonas Peters, Eleni Sgouritsa, Kun Zhang, and Joris Mooij
Abstract We consider the problem of learning in the case where an underlying causal model can be inferred. Causal knowledge may facilitate some approaches to a given problem and rule out others. We formulate the hypothesis that semi-supervised learning can help in an anticausal setting, but not in a causal setting, and corroborate it with empirical results.
13.1 Introduction

Es gibt keinen gefährlicheren Irrtum, als die Folge mit der Ursache zu verwechseln: ich heiße ihn die eigentliche Verderbnis der Vernunft.¹
Friedrich Nietzsche, Götzen-Dämmerung
It has been argued that statistical dependencies are always due to underlying causal structures [12]. Machine learning has been very successful in exploiting these dependencies [19]. However, could it also benefit from knowledge of the underlying causal structures? We assay this in the simplest possible setting, where the causal structure consists only of cause and effect, with a focus on the case of semi-supervised learning. This follows the work presented at the Festschrift symposium,
¹ There is no more dangerous mistake than confusing cause and effect: I call it the actual corruption of reason.
B. Schölkopf · D. Janzing · J. Peters · E. Sgouritsa · K. Zhang
Max Planck Institute for Intelligent Systems, Spemannstrasse, 72076 Tübingen, Germany
e-mail: [email protected]; [email protected]; [email protected]; [email protected]; [email protected]

J. Mooij
Institute for Computing & Information Sciences, Radboud University, Nijmegen, Netherlands
e-mail: [email protected]

B. Schölkopf et al. (eds.), Empirical Inference, DOI 10.1007/978-3-642-41136-6_13, © Springer-Verlag Berlin Heidelberg 2013
and it draws heavily from a conference paper published since then [13]. The latter provides less detail on the experiments for the case of semi-supervised learning, but it discusses the cases of covariate shift and transfer learning; see also [18]. Pearl and Bareinboim [11] introduce a variable S that labels different domains or datasets, and explain how the manner in which S is causally linked to the variables of interest determines whether causal or statistical statements can be transferred across domains. Their notion of transportability employs conditional independencies to express invariance of mechanisms. The paper [13] discusses a type of invariance where the function in a structural equation remains the same, but the distribution of the noise changes across datasets. Finally, note that the issue is also related to the distinction between generative and discriminative learning; see, for instance, [15].
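The invariance notion just mentioned can be illustrated with a toy simulation. This sketch is not taken from the paper's experiments; the mechanism f(x) = 2x, the noise scales, and the sample sizes are all invented for illustration. Two datasets share the same structural equation Y = f(X) + N, while only the distribution of the noise N differs; a least-squares fit then recovers approximately the same mechanism in both domains.

```python
import random

def generate(n, noise_scale, seed):
    """Sample (X, Y) pairs from Y = f(X) + N with f(x) = 2x.

    The mechanism f is identical across domains; only the noise
    distribution (here, its standard deviation) changes.
    """
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0.0, noise_scale) for x in xs]
    return xs, ys

def fit_slope(xs, ys):
    """Ordinary least-squares slope estimate: cov(X, Y) / var(X)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var

# Domain 1: small noise; domain 2: larger noise, same mechanism.
xs1, ys1 = generate(2000, noise_scale=0.1, seed=0)
xs2, ys2 = generate(2000, noise_scale=0.5, seed=1)
s1 = fit_slope(xs1, ys1)
s2 = fit_slope(xs2, ys2)
print(s1, s2)  # both estimates lie close to the true slope 2.0
```

The point of the sketch is that the fitted mechanism is stable across the two datasets even though their joint distributions P(X, Y) differ; this is the kind of invariance that [13] exploits.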
13.2 Causal Inference

We briefly summarize some aspects of causal graphical models as pioneered by Pearl [10] and Spirtes et al. [17]. These are usually thought of as joint probability distributions over a set of variables $X_1, \dots, X_n$, along