Multi-agent graphical games with input constraints: an online learning solution
Control Theory and Technology http://link.springer.com/journal/11768
Tianxiang WANG, Bingchang WANG†, Yong LIANG
School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China
Received 17 February 2020; revised 25 April 2020; accepted 26 April 2020
Abstract
This paper studies an online iterative algorithm for solving discrete-time multi-agent dynamic graphical games with input constraints. To obtain the optimal strategy of each agent, a set of coupled Hamilton-Jacobi-Bellman (HJB) equations must be solved, which is very difficult with traditional methods, and the game becomes even more complex when the control input of each agent is constrained. In this paper, an online iterative algorithm is proposed to solve the dynamic graphical game online without requiring the drift dynamics of the agents; in effect, the algorithm finds the optimal solution of the Bellman equations online. The solution employs a distributed policy iteration process that uses only the local information available to each agent. It can be proved that, under certain conditions, when all agents update their strategies simultaneously, the whole multi-agent system reaches Nash equilibrium. In the implementation, each agent uses two neural networks to approximate its value function and its control strategy, respectively. Finally, a simulation example is given to show the effectiveness of our method.

Keywords: Actor-critic algorithm, differential games, input constraints, neural network (NN), reinforcement learning (RL)

DOI: https://doi.org/10.1007/s11768-020-0013-6
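As a rough illustration of the two-network design described in the abstract, the sketch below implements a single agent's actor-critic pair: a critic that tunes a value-function approximation from the temporal-difference (Bellman) residual, and an actor whose output is saturated by tanh so that the applied control respects the input constraint. The basis functions, step sizes, cost weights, error dynamics (A, B), and the single-agent training loop are all illustrative assumptions for this sketch, not the paper's exact formulation; in the graphical game each agent's local error would also depend on its neighbors' states.

```python
# Minimal sketch (assumptions noted above): per-agent critic/actor with a
# tanh-saturated control to respect the input constraint |u| <= U_MAX.
import numpy as np

rng = np.random.default_rng(0)

U_MAX = 1.0                   # symmetric input bound (assumed)
GAMMA = 0.9                   # discount factor of the discrete-time Bellman equation
ALPHA_C, ALPHA_A = 0.05, 0.1  # critic / actor step sizes (illustrative)
Q = np.eye(2)                 # state-error weight (assumed)
R = 1.0                       # control weight (assumed, scalar input)

def phi(e):
    """Critic basis: quadratic features of the local error e in R^2 (illustrative)."""
    return np.array([e[0] ** 2, e[0] * e[1], e[1] ** 2])

def phi_jac(e):
    """Jacobian d(phi)/d(e), used to form the critic's gradient estimate of V."""
    return np.array([[2 * e[0], 0.0],
                     [e[1],     e[0]],
                     [0.0,      2 * e[1]]])

class Agent:
    """One node of the game: a critic (value function) and an actor (bounded policy)."""
    def __init__(self, B):
        self.B = B                              # local input matrix (assumed known here)
        self.Wc = 0.1 * rng.standard_normal(3)  # critic weights: V(e) ~ Wc . phi(e)
        self.Wa = 0.1 * rng.standard_normal(2)  # actor weights: u(e) = U_MAX*tanh(Wa . e)

    def control(self, e):
        # tanh saturation keeps the applied control inside the input constraint
        return U_MAX * np.tanh(self.Wa @ e)

    def update(self, e, u, e_next):
        # Critic: gradient step on the temporal-difference (Bellman) residual
        cost = e @ Q @ e + R * u ** 2
        td = cost + GAMMA * self.Wc @ phi(e_next) - self.Wc @ phi(e)
        self.Wc += ALPHA_C * td * phi(e)
        # Actor: least-squares step toward the constrained greedy policy implied by
        # the critic, u_target = -U_MAX * tanh(GAMMA * B^T grad V / (2 R U_MAX)),
        # standing in for the paper's policy-improvement step.
        grad_V = phi_jac(e_next).T @ self.Wc
        u_target = -U_MAX * np.tanh(GAMMA * (self.B @ grad_V) / (2 * R * U_MAX))
        self.Wa += ALPHA_A * (u_target - u) * U_MAX * (1 - np.tanh(self.Wa @ e) ** 2) * e

# Toy online loop for a single agent with placeholder linear error dynamics.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([0.0, 0.1])
agent, e = Agent(B), np.array([1.0, -0.5])
for _ in range(2000):
    u = agent.control(e)
    e_next = A @ e + B * u
    agent.update(e, u, e_next)
    e = e_next
```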
1 Introduction
In recent years, distributed cooperative control of multi-agent systems has attracted wide attention because of its applications in computer science, aircraft, unmanned vehicles, mobile robots, sensor networks, and so on [1]. In the leaderless multi-agent consensus problem, distributed control strategies drive all agents to synchronize to a common value that cannot be prescribed in advance and is determined by the initial states of all agents. In [2], the authors studied a multi-agent mean field game with multiplicative noise, obtained a set of decentralized strategies by solving an auxiliary limiting optimal control problem, and showed that all agents eventually achieve mean-square consensus under mild conditions.
† Corresponding author. E-mail: [email protected]. This work was supported by the National Natural Science Foundation of China (Nos. 61773241, 61973183) and the Shandong Provincial Natural Science Foundation (No. ZR2019MF041).
© 2020 South China University of Technology, Academy of Mathematics and Systems Science, CAS and Springer-Verlag GmbH Germany, part of Springer Nature
In the cooperative tracking consensus problem, each agent can synchronize to the trajectory of the leader.