Maximizing Fun by Creating Data with Easily Reducible Subjective Complexity


Abstract

The Formal Theory of Fun and Creativity (1990–2010) [Schmidhuber, J.: Formal theory of creativity, fun, and intrinsic motivation (1990–2010). IEEE Trans. Auton. Mental Dev. 2(3), 230–247 (2010b)] describes principles of a curious and creative agent that never stops generating nontrivial and novel and surprising tasks and data. Two modules are needed: a data encoder and a data creator. The former encodes the growing history of sensory data as the agent is interacting with its environment; the latter executes actions shaping the history. Both learn. The encoder continually tries to encode the created data more efficiently, by discovering new regularities in it. Its learning progress is the wow-effect or fun or intrinsic reward of the creator, which maximizes future expected reward, being motivated to invent skills leading to interesting data that the encoder does not yet know but can easily learn with little computational effort. I have argued that this simple formal principle explains science and art and music and humor.

Note: This overview heavily draws on previous publications since 1990, especially Schmidhuber (2010b), parts of which are reprinted with friendly permission by IEEE.
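
To make the interplay of the two modules concrete, here is a minimal, purely illustrative sketch in Python (not code from the cited publications). A count-based predictive model stands in for the data encoder, the number of bits it needs to encode the observation history stands in for the subjective complexity of that history, and the intrinsic reward handed to the data creator is the compression progress achieved by the encoder's latest learning step, roughly C(old model, history) - C(new model, history) in the spirit of Schmidhuber (2010b). All names (CountEncoder, toy_observation, ALPHABET_SIZE) are hypothetical.

# Illustrative sketch only: a count-based model as "encoder", with the
# intrinsic reward of the "creator" defined as the encoder's compression
# progress on the recorded history after each learning step.

import copy
import math
import random
from collections import defaultdict

ALPHABET_SIZE = 1000  # assumed upper bound on distinct observations (toy choice)


class CountEncoder:
    """Adaptive model of observations; history_bits() plays the role of the
    subjective complexity of the history under the current model."""

    def __init__(self):
        self.counts = defaultdict(int)  # observation -> how often it was seen
        self.total = 0

    def update(self, obs):
        """Learn from one observation."""
        self.counts[obs] += 1
        self.total += 1

    def bits(self, obs):
        """Code length of one observation: -log2 of its smoothed probability."""
        p = (self.counts[obs] + 1) / (self.total + ALPHABET_SIZE)
        return -math.log2(p)

    def history_bits(self, history):
        """Bits needed to encode the whole history with the current model."""
        return sum(self.bits(obs) for obs in history)


def toy_observation(action):
    """Toy world: low-numbered actions yield regular data, others yield noise."""
    if action < 5:
        return ("pattern", action % 3)
    return ("noise", random.randrange(ALPHABET_SIZE))


def intrinsic_reward(encoder, history, obs):
    """Compression progress caused by learning from obs (the 'wow-effect')."""
    history.append(obs)
    old_model = copy.deepcopy(encoder)  # snapshot of the encoder before learning
    encoder.update(obs)                 # the encoder improves itself on the new data
    return old_model.history_bits(history) - encoder.history_bits(history)


if __name__ == "__main__":
    encoder, history = CountEncoder(), []
    for step in range(20):
        action = random.randrange(10)   # a real creator would choose, not sample
        reward = intrinsic_reward(encoder, history, toy_observation(action))
        print(f"step {step:2d}  action {action}  intrinsic reward {reward:+.3f} bits")

In the full theory the creator is itself a reinforcement learner that chooses actions so as to maximize the expected future sum of such compression-progress rewards; the sketch above only computes the reward signal that such a creator would receive.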

1 Introduction

“All life is problem solving,” wrote Popper (1999). To solve existential problems such as avoiding hunger or heat, a baby has to learn how the initially unknown environment responds to its actions. Even when there is no immediate need to satisfy thirst or other built-in primitive drives, the baby does not run idle. Instead it actively conducts experiments: what sensory feedback do I get if I move my eyes or my fingers or my tongue just like that? Being able to predict effects of actions will later make it easier to plan control sequences leading to desirable states, such as those where heat and hunger sensors are switched off. The growing infant quickly gets bored by things it already understands well, but also by those it does not understand at all, always searching for new effects exhibiting some yet unexplained but easily learnable regularity. It acquires more and more complex behaviors building on previously acquired, simpler behaviors. Eventually it might become a physicist discovering previously unknown physical laws, or an artist creating new eye-opening artworks, or a comedian coming up with novel jokes. For a long time I have been arguing, using various wordings, that all this behavior is driven by a very simple algorithmic mechanism that uses more or less general reinforcement learning (RL) methods (Hutter 2005; Kaelbling et al. 1996; Schmidhuber 1991d, 2009e; Sutton and Barto 1998) to maximize internal wow-effects or fun or intrinsic reward through redu