Conditional variance penalties and domain shift robustness



Christina Heinze‑Deml · Nicolai Meinshausen
Received: 15 July 2019 / Revised: 1 October 2020 / Accepted: 15 October 2020
© The Author(s) 2020

Abstract

When training a deep neural network for image classification, one can broadly distinguish between two types of latent features of images that will drive the classification. We can divide latent features into (i) 'core' or 'conditionally invariant' features C whose distribution C|Y, conditional on the class Y, does not change substantially across domains and (ii) 'style' features S whose distribution S|Y can change substantially across domains. Examples of style features include position, rotation, image quality or brightness, but also more complex ones such as hair color or posture for images of persons. Our goal is to minimize a loss that is robust under changes in the distribution of these style features. In contrast to previous work, we assume that the domain itself is not observed and is hence a latent variable. We do assume that we can sometimes observe a typically discrete identifier or "ID variable". In some applications we know, for example, that two images show the same person, and ID then refers to the identity of the person. The proposed method requires only a small fraction of images to have ID information. We group observations if they share the same class and identifier (Y, ID) = (y, id) and penalize the conditional variance of the prediction or the loss if we condition on (Y, ID). Using a causal framework, this conditional variance regularization (CoRe) is shown to protect asymptotically against shifts in the distribution of the style variables in a partially linear structural equation model. Empirically, we show that the CoRe penalty improves predictive accuracy substantially in settings where domain changes occur in terms of image quality, brightness and color, while we also look at more complex changes such as changes in movement and posture.

Keywords: Domain shift · Dataset shift · Causal models · Distributional robustness · Anticausal prediction · Image classification
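To make the grouping idea above concrete, the following is a minimal sketch, not the authors' implementation. It assumes a PyTorch classifier `model`, a batch of inputs `x`, class labels `y`, identifiers `id_`, and a tuning parameter `penalty_weight` (all hypothetical names) and penalizes the conditional variance of the loss: examples that share the same (Y, ID) pair are pooled, the within-group variance of their losses is averaged over groups, and this term is added to the usual empirical risk.

```python
# Minimal sketch of a conditional-variance (CoRe-style) penalty on the loss.
# Assumptions: `model` is a PyTorch classifier returning logits, `x` is a batch
# of inputs, `y` and `id_` are 1-D integer tensors of equal length with the
# class label and the (possibly coarse) identifier of each example.
import torch
import torch.nn.functional as F

def core_loss(model, x, y, id_, penalty_weight=1.0):
    logits = model(x)
    per_example_loss = F.cross_entropy(logits, y, reduction="none")

    # Assign each example to a group defined by its unique (Y, ID) pair.
    pairs = torch.stack([y, id_], dim=1)
    _, group_idx = torch.unique(pairs, dim=0, return_inverse=True)

    # Average the within-group variance of the loss over all groups with
    # at least two members; singleton groups contribute no variance.
    penalty = x.new_zeros(())
    n_groups = 0
    for g in torch.unique(group_idx):
        group_losses = per_example_loss[group_idx == g]
        if group_losses.numel() > 1:
            penalty = penalty + group_losses.var(unbiased=True)
            n_groups += 1
    if n_groups > 0:
        penalty = penalty / n_groups

    return per_example_loss.mean() + penalty_weight * penalty
```

The paper also considers penalizing the conditional variance of the prediction rather than the loss; in either variant, only observations that share (Y, ID) with at least one other observation contribute to the penalty, so examples without ID information simply enter the standard cross-entropy term.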

Editor: Paolo Frasconi.
* Christina Heinze‑Deml, [email protected]
Nicolai Meinshausen, [email protected]



Seminar for Statistics, ETH Zurich, Zurich, Switzerland


1 Introduction

Deep neural networks (DNNs) have achieved outstanding performance on prediction tasks like visual object and speech recognition (Krizhevsky et al. 2012; Szegedy et al. 2015; He et al. 2015). Issues can arise when the learned representations rely on dependencies that vanish in test distributions (see for example Quionero-Candela et al. (2009), Torralba and Efros (2011), Csurka (2017) and references therein). Such domain shifts can be caused by changing conditions, such as changes in color, background or location. Predictive performance is then likely to degrade. For example, consider the analysis presented in Kuehlkamp et al. (2017), which is concerned with the problem o