Action Discovery and Intrinsic Motivation: A Biologically Constrained Formalisation




Abstract We introduce a biologically motivated, formal framework, or “ontology”, for dealing with many aspects of action discovery, which we argue is an example of intrinsically motivated behaviour (as such, this chapter is a companion to that by Redgrave et al. in this volume). We argue that action discovery requires an interplay between separate internal forward models of prediction and inverse models mapping outcomes to actions. The process of learning actions is driven by transient changes in the animal’s policy (repetition bias), which are, in turn, a result of unpredicted, phasic sensory information (“surprise”). The notion of salience as value is introduced and broken down into contributions from novelty (or surprise), immediate reward acquisition, or general task/goal attainment. Many other aspects of biological action discovery emerge naturally in our framework, which aims to guide future modelling efforts in this domain.

K. Gurney, N. Lepora, A. Shah, P. Redgrave
Adaptive Behaviour Research Group, Department of Psychology, University of Sheffield, Sheffield, UK

A. Koene
Laboratory for Integrated Theoretical Neuroscience, RIKEN Brain Science Institute, Saitama, Japan

G. Baldassarre and M. Mirolli (eds.), Intrinsically Motivated Learning in Natural and Artificial Systems, DOI 10.1007/978-3-642-32375-1_7, © Springer-Verlag Berlin Heidelberg 2013

1 Introduction

As described in detail elsewhere in this volume, there are several reasons why behaviour can be described as “intrinsically motivated” and why intrinsically motivated behaviour is useful. Common to many accounts is the idea that intrinsically motivated behaviour allows us to gain competence in achieving goals in an environment by developing skills for, and knowledge of, our interaction with that environment (see, e.g. Barto et al. 2004). In addition, intrinsically motivated behaviour of this kind usually results in the development of internal models of action-outcome causality, or “know-how” (Oudeyer and Kaplan 2007). Such competences allow us to accomplish subsequent tasks and goals more effectively. In this chapter we focus on how intrinsic motivation helps an animal determine action-outcome causality. We have recently taken the first steps towards a biologically plausible account of this process (Redgrave and Gurney 2006; Redgrave et al. 2008). These ideas are also described in Redgrave et al. (2012) and summarised in Sect. 2. The focus of that work was an analysis of the physiological and anatomical evidence that implicates short-latency, phasic (transient) changes in the levels of the neurotransmitter dopamine in learning causality and, in particular, points to its role as a signal of sensory prediction error. In the tradition of Marr and Poggio (1976), we have therefore proposed a computational rationale for phasic dopamine. Thus, in brief, phasic dopamine causes the animal to repea