We use reinforcement learning (RL) to evolve soccer team strategies. RL may profit significantly from world models (WMs). In high-dimensional, continuous input spaces, however, learning accurate WMs is intractable. In this chapter, we show that incomplete WMs can help to quickly find good policies. Our approach is based on a novel combination of CMACs and prioritized sweeping. Variants thereof outperform other algorithms used in previous work.

1 Introduction

Game playing programs have been a major focus of artificial intelligence (AI) research. How should positions be represented and evaluated? How can planning exploit those evaluations to select the best next move (action)? Berliner's non-adaptive backgammon program (1977) had a prewired evaluation function (EF); it cost many man-years of programming effort, but achieved only a mediocre level of play. Tesauro's TD-Gammon program (1992), however, used reinforcement learning (RL) to learn the backgammon EF by playing against itself. After only three months of training on an RS6000, TD-Gammon played at human expert level. Back in 1959, Samuel had already constructed an RL program which learned an EF for the game of checkers, resulting in the first game playing program to defeat its own programmer. Related efforts are described by Baxter (chess, 1997), Thrun (chess, 1995), and Schraudolph (Go, 1994).
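
For concreteness, here is a minimal sketch of the TD(0) idea behind such learned evaluation functions. It is not Tesauro's system (TD-Gammon used a multilayer network and TD(λ), trained by self-play on backgammon); the toy random-walk task and all names and parameters below are illustrative assumptions.

```python
import random

# Minimal sketch of TD(0) learning of an evaluation function. Not
# Tesauro's system: the 5-state random walk below is an illustrative
# stand-in for a game, and all parameters are assumptions.

N_STATES = 5             # non-terminal states 0..4
ALPHA, GAMMA = 0.1, 1.0  # learning rate; no discounting within an episode

def features(s):
    """One-hot encoding of a state (a hypothetical choice of features)."""
    return [1.0 if i == s else 0.0 for i in range(N_STATES)]

def evaluate(w, s):
    """Linear evaluation function: estimated probability of 'winning'."""
    return sum(wi * fi for wi, fi in zip(w, features(s)))

def td0_episode(w):
    """Run one episode, applying the TD(0) update after every move:
    V(s) <- V(s) + alpha * (target - V(s)), where the target is the
    final reward at terminal transitions and gamma * V(s') otherwise."""
    s = N_STATES // 2
    while True:
        s2 = s + random.choice((-1, 1))
        terminal = s2 < 0 or s2 >= N_STATES
        if terminal:
            target = 1.0 if s2 >= N_STATES else 0.0  # "win" on the right end
        else:
            target = GAMMA * evaluate(w, s2)
        error = target - evaluate(w, s)
        for i, fi in enumerate(features(s)):
            w[i] += ALPHA * error * fi
        if terminal:
            return
        s = s2

weights = [0.5] * N_STATES
for _ in range(5000):
    td0_episode(weights)
print([round(v, 2) for v in weights])  # approaches [0.17, 0.33, 0.50, 0.67, 0.83]
```

The key property, shared with TD-Gammon, is that no teacher supplies correct evaluations: each position's estimate is trained toward the estimate of the position that follows it, and toward the actual outcome at the end of the game.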

Soccer. We apply RL to a game quite different from board games: soccer. It involves multiple interacting agents and ambiguous inputs. We are partly motivated by the popularity of the international soccer RoboCup. Most early research efforts in the field concentrated on implementing detailed behaviors exploiting the official RoboCup soccer simulator. More recently, however, machine learning (ML), and RL in particular, has been used to improve soccer teams [36], mostly to improve cooperation between players and to construct high-level strategies. The RoboCup simulator is too complex to evaluate and compare different RL methods for soccer teams learning from scratch, without prewired tactics and behaviors. Therefore we built our own simulator, which is simpler, faster, and easier to comprehend.

Learning to play soccer. Our goal is to build teams of autonomous agents that learn to play soccer from very sparse reinforcement signals: only scoring a goal yields reward for the successful team. Team members try to maximize reward by improving their adaptive decision policy, which maps (virtual) sensory inputs to actions. In principle there are at least two types of learning algorithms applicable to such problems: reinforcement learning (RL), e.g., [29], [37], [39], [43], and evolutionary approaches, e.g., [9], [13], [22], [24], [25]. Here we describe a novel RL method and compare its results to those obtained by previous RL methods and an evolutionary approach. Most existing RL algorithms are based on function approximators (FAs) learning value functions (VFs) that map states to estimates of expected future reward.
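
To make the role of such FAs concrete, the sketch below shows a simple CMAC (tile coding) approximator of the general kind used to learn VFs over continuous inputs. The specific CMAC design used in this chapter, and its combination with prioritized sweeping, are not given in this excerpt, so every detail below is an illustrative assumption.

```python
import random

# A minimal sketch of a CMAC (tile coding) value-function approximator
# for a one-dimensional continuous input. All tiling parameters are
# illustrative assumptions, not the chapter's actual CMAC layout.

class CMAC1D:
    """Several overlapping tilings over [lo, hi), each shifted by a
    fraction of the tile width; the value of an input is the sum of one
    learned weight per tiling, so training generalizes to nearby inputs."""

    def __init__(self, lo, hi, n_tiles=10, n_tilings=5):
        self.lo, self.n_tiles, self.n_tilings = lo, n_tiles, n_tilings
        self.tile_width = (hi - lo) / n_tiles
        # one weight table per tiling; n_tiles + 1 cells allow for the shift
        self.w = [[0.0] * (n_tiles + 1) for _ in range(n_tilings)]

    def _active(self, x):
        """Yield (tiling, tile index) pairs for the tiles covering x."""
        for t in range(self.n_tilings):
            shift = t * self.tile_width / self.n_tilings
            yield t, int((x - self.lo + shift) / self.tile_width)

    def value(self, x):
        return sum(self.w[t][i] for t, i in self._active(x))

    def update(self, x, target, alpha=0.1):
        """Move value(x) toward target, splitting the error evenly over
        the active tiles."""
        error = (target - self.value(x)) / self.n_tilings
        for t, i in self._active(x):
            self.w[t][i] += alpha * error

# Toy usage: learn V(x) = x^2 on [0, 1) from noisy samples.
cmac = CMAC1D(0.0, 1.0)
for _ in range(20000):
    x = random.random()
    cmac.update(x, x * x + random.gauss(0.0, 0.01))
print(round(cmac.value(0.5), 2))  # roughly 0.25
```

Because each input activates only one tile per tiling, lookups and updates are cheap and local, which is part of what makes CMACs attractive in the high-dimensional, continuous input spaces mentioned above, where accurate global world models are intractable to learn.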