SBC++ Simulator Team

Electrical and Computer Engineering Faculty, Shahid Beheshti University

1 Introduction

In this article the SBC++ simulator team and its overall design and scientific goals are described. Our main goal in this project is to develop new methods of machine learning and decision making, as well as multi-agent behaviors such as strategy setting and cooperative learning, with emphasis on behavior networks. To achieve this, SBC++ uses the MASM [1] model, designed by Maes, as its fundamental base. Our team uses the CMUnited-99 [4] low-level code with little modification; the major modifications and improvements are in the high-level code.

2 Team Development

Team Leader: Eslam Nazemi
- Iran
- Faculty member
- Attended the competition

Team Members: Amin Abbaspour, Mahmood Rahmani, Bahman Radjabalipour
- Iran
- Undergraduate students
- Attended the competition

Web page: http://www.sbu.ac.ir/sbc/

3 REASM behavior network model

MASM (Maes Action Selection Mechanism) was first introduced in 1989 by Maes [1]. In 1999, Dorer [2] introduced a revised and expanded version of it called REASM. Ten years of research on MASM resulted in REASM, which showed its abilities in RoboCup 99 [3]. Here we have successfully added a new learning method to REASM. A brief overview of what REASM is and how it works might therefore be helpful; more detailed information about REASM and its principles can be found in [2].

3.1 REASM in brief

There are three main layers in this model:

- Goal layer
- Competence layer
- Perception layer

Each layer contains several modules, and the functions related to these modules are executed in each step. The perception layer gives information about the world to the competence layer through the variable $e$, which shows the executability of choosing each competence module. Each competence module has an associated behavior; each time a competence module is chosen, its behavior is executed. Competence modules also have the following set of parameters (a rough sketch of such a module as a data structure is given after this list):

- $\gamma$: activation of modules
- $\delta$: inhibition of modules
- $\beta$: inertia of modules
- $\theta$: activation threshold, $\theta \in (0, a]$, where $a$ is the threshold upper limit
- $\Delta\theta$: threshold decay
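As a rough illustration (not the actual SBC++ or CMUnited-99 source), a competence module and the network parameters above might be held in small C++ structures like the following; all identifiers here are assumptions made for the sketch.

\begin{verbatim}
// Hypothetical sketch of the behavior-network parameters and a competence
// module; identifiers are illustrative, not taken from the SBC++ code base.
#include <functional>
#include <string>

struct NetworkParams {
    double gamma;        // activation of modules
    double delta;        // inhibition of modules
    double beta;         // inertia of modules
    double theta;        // activation threshold, in (0, a]
    double theta_decay;  // threshold decay
};

struct CompetenceModule {
    std::string name;
    double e;                        // executability, from the perception layer
    double activation;               // a_k^t, updated every decision cycle
    std::function<void()> behavior;  // behavior executed when the module is chosen
};
\end{verbatim}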

Competence modules are activated and inhibited by goal modules or other competence modules. The following formulas calculate this activation and inhibition for competence module $k$:

\begin{align*}
a'^{\,t}_{kg_i} &= \gamma \cdot f(\iota_{g_i}, r^t_{g_i}) \cdot ex_j \tag{1}\\
a''^{\,t}_{kg_i} &= -\delta \cdot f(\iota_{g_i}, r^t_{g_i}) \cdot ex_j \tag{2}\\
a'''^{\,t}_{kg_i} &= \gamma \cdot a^{t-1}_{succ\,g_i} \cdot \bigl(1 - \tau(p_{succ}, s)\bigr) \cdot ex_j \tag{3}\\
a''''^{\,t}_{kg_i} &= -\delta \cdot a^{t-1}_{conf\,g_i} \cdot \tau(p_{conf}, s) \cdot ex_j \tag{4}\\
a^t_{kg_i} &= \operatorname{absmax}\bigl(a'^{\,t}_{kg_i},\, a''^{\,t}_{kg_i},\, a'''^{\,t}_{kg_i},\, a''''^{\,t}_{kg_i}\bigr) \tag{5}\\
a^t_k &= \beta \cdot a^{t-1}_k + \sum_{i=1}^{n} a^t_{kg_i} \tag{6}
\end{align*}

where:

- $n$: the number of goal modules linked to this competence module
- $ex_j$: the expectation of the goal module linked to the competence module
- $\tau(p, s)$: a triangular function of proposition $p$ and world-state $s$
- $f(\iota, r)$: a triangular function of importance $\iota$ and relevance $r$
- $\sigma(x) = 1/(1 + \exp(k(\theta - x)))$: the Goetz [5] formula

After calculating all competence module activations, the module with the highest activation above the threshold $\theta$ is selected and its behavior is executed.
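Under the same illustrative assumptions, one update step of equations (1)-(6) could be sketched as below: goalContribution computes equations (1)-(5) for a single linked goal, and moduleActivation sums the contributions as in (6). Helper names, parameters, and the data layout are hypothetical, not the actual SBC++ implementation.

\begin{verbatim}
// Hypothetical sketch of the activation spreading in equations (1)-(6);
// not the actual SBC++ code.
#include <cmath>
#include <initializer_list>
#include <vector>

struct Params { double gamma, delta, beta; };  // see the parameter list above

// Contribution of one goal g_i to competence module k, equations (1)-(5).
//   u       : f(iota_gi, r_gi^t), current utility of the goal
//   ex      : ex_j, expectation of the linked effect
//   aSucc   : a_{succ gi}^{t-1}, activation of the successor module (0 if none)
//   tauSucc : tau(p_succ, s)
//   aConf   : a_{conf gi}^{t-1}, activation of the conflicting module (0 if none)
//   tauConf : tau(p_conf, s)
double goalContribution(const Params& p, double u, double ex,
                        double aSucc, double tauSucc,
                        double aConf, double tauConf) {
    double a1 =  p.gamma * u * ex;                        // (1) activation by the goal
    double a2 = -p.delta * u * ex;                        // (2) inhibition by the goal
    double a3 =  p.gamma * aSucc * (1.0 - tauSucc) * ex;  // (3) activation by a successor
    double a4 = -p.delta * aConf * tauConf * ex;          // (4) inhibition by a conflictor
    // (5) keep the value with the largest absolute magnitude
    double best = a1;
    for (double a : {a2, a3, a4})
        if (std::fabs(a) > std::fabs(best)) best = a;
    return best;
}

// (6) new activation of module k from its previous activation and the
// contributions of the n goals linked to it.
double moduleActivation(const Params& p, double prev,
                        const std::vector<double>& goalContribs) {
    double sum = 0.0;
    for (double a : goalContribs) sum += a;
    return p.beta * prev + sum;
}
\end{verbatim}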