Event-triggered adaptive dynamic programming for multi-player zero-sum games with unknown dynamics



METHODOLOGIES AND APPLICATION

Yongwei Zhang¹ · Bo Zhao² · Derong Liu¹

© Springer-Verlag GmbH Germany, part of Springer Nature 2020

Abstract

In this paper, a novel event-triggered optimal control approach is developed to solve zero-sum game problems for continuous-time multi-player nonlinear systems with unknown dynamics. To begin with, a model neural network (NN) is employed to reconstruct the unknown multi-player nonlinear system from measured input and output data. Then, a critic NN is used to solve the event-triggered Hamilton–Jacobi–Isaacs (HJI) equation for the multi-player zero-sum game. Meanwhile, the optimal control law and the worst disturbance law are both approximated using only the critic NN. Compared with time-triggered methods, the developed control law and disturbance law are updated only when the triggering condition is violated; thus, the computational and communication burdens are reduced. The Lyapunov stability analysis shows that the closed-loop system can be guaranteed to be stable. Finally, two simulation examples are provided to validate the effectiveness of the proposed method.

Keywords Adaptive dynamic programming · Event-triggered control · Neural network · Multi-player zero-sum game
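To make the event-triggered idea in the abstract concrete, the following minimal sketch holds a control input and a disturbance input constant between events and refreshes them only when a triggering condition is violated. It is an illustration only and not the method of this paper: there is no model NN or critic NN, the scalar linear plant, the fixed gains `k` and `l`, and the relative-error trigger threshold `sigma` are all hypothetical choices made for the example.

```python
def simulate_event_triggered(a=1.0, b=1.0, d=1.0, k=3.0, l=0.5,
                             sigma=0.2, x0=1.0, dt=1e-3, t_end=5.0):
    """Euler simulation of x' = a*x + b*u + d*w with event-triggered inputs.

    All parameters are hypothetical illustration values, not taken from the paper.
    u = -k * x_hat and w = l * x_hat use the last sampled state x_hat,
    which is refreshed only when |x_hat - x| > sigma * |x|.
    """
    steps = int(round(t_end / dt))
    x = x0
    x_hat = x0              # state sampled at the most recent event
    u = -k * x_hat          # control input, held between events
    w = l * x_hat           # disturbance input, held between events
    events = 0
    for _ in range(steps):
        # Triggering condition: sampling error too large relative to the state.
        if abs(x_hat - x) > sigma * abs(x):
            x_hat = x
            u = -k * x_hat
            w = l * x_hat
            events += 1
        # Forward-Euler step of the plant with the held inputs.
        x += dt * (a * x + b * u + d * w)
    return x, events, steps


x_final, events, steps = simulate_event_triggered()
print(f"final state {x_final:.2e}, {events} input updates over {steps} steps")
```

In this toy setting a time-triggered scheme would recompute both inputs at every one of the integration steps, whereas the event-triggered loop recomputes them only at the triggering instants, which is the source of the reduced computation and communication described above.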

Communicated by V. Loia.

Bo Zhao [email protected] · Yongwei Zhang [email protected] · Derong Liu [email protected]

¹ School of Automation, Guangdong University of Technology, Guangzhou 510006, China
² School of Systems Science, Beijing Normal University, Beijing 100875, China

1 Introduction

In real industrial applications, such as electric power systems, communication networks, aircraft, and manufacturing systems, control systems often consist of more than one controller. Each controller can be regarded as a player with an individual policy, and the players operate as a group whose interaction under a general quadratic performance index function forms a game (Jiang and Zhang 2018). Therefore, multi-player game problems, which can be divided into two categories, i.e., zero-sum games (ZSGs) and nonzero-sum games (NZSGs), have attracted much attention. For multi-player game problems, it is necessary to solve Hamilton–Jacobi (HJ) or Hamilton–Jacobi–Isaacs (HJI) equations, which are difficult or impossible to solve analytically for highly nonlinear systems (Liu et al. 2017). Hence, many scholars have proposed different methods to address these problems (Aliyu 2018; Zhu and Zhao 2015). As one of the effective methods, adaptive dynamic programming (ADP) is employed to solve nonlinear HJ or HJI equations (Zhu and Zhao 2018; Wei et al. 2016, 2017a; Liu et al. 2018; Zhao et al. 2018a, b; Wang et al. 2019). For NZSG problems, a policy iteration (PI) method was developed for continuous-time (CT) nonlinear systems (Zhao et al. 2016), where the system function was approximated by an NN and a novel weight tuning law was proposed using the experience replay technique. In Liu et al. (2014), an online synchronous PI algorithm was proposed to deal with the multi-