Reinforcement learning and adaptive optimization of a class of Markov jump systems with completely unknown dynamic infor


EXTREME LEARNING MACHINE AND DEEP LEARNING NETWORKS

Reinforcement learning and adaptive optimization of a class of Markov jump systems with completely unknown dynamic information

Shuping He1,2 • Maoguang Zhang1 • Haiyang Fang1 • Fei Liu3 • Xiaoli Luan3 • Zhengtao Ding4

Received: 29 November 2018 / Accepted: 29 March 2019
© Springer-Verlag London Ltd., part of Springer Nature 2019

Abstract In this paper, an online adaptive optimal control problem for a class of continuous-time Markov jump linear systems (MJLSs) with completely unknown dynamics is investigated using a parallel reinforcement learning (RL) algorithm. Before the state and input information of the subsystems is collected and learned, exploration noise is first added to describe the actual control input. Then, a novel parallel RL algorithm is used to solve the corresponding N coupled algebraic Riccati equations in parallel by online learning. With this algorithm, the dynamic information of the MJLSs does not need to be known. The convergence of the proposed algorithm is also proved. Finally, the effectiveness and applicability of this novel algorithm are illustrated by two simulation examples.

Keywords Markov jump linear systems (MJLSs) • Adaptive optimal control • Online • Reinforcement learning (RL) • Coupled algebraic Riccati equations (AREs)

✉ Shuping He [email protected]
Haiyang Fang [email protected]
Fei Liu [email protected]
Xiaoli Luan [email protected]
Zhengtao Ding [email protected]

1 School of Electrical Engineering and Automation, Anhui University, Hefei 230601, China
2 Institute of Physical Science and Information Technology, Anhui University, Hefei 230601, China
3 Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Institute of Automation, Jiangnan University, Wuxi 214122, China
4 School of Electrical and Electronic Engineering, The University of Manchester, Manchester M13 9PL, UK

1 Introduction

Markov jump linear systems (MJLSs), first proposed by Krasovskii and Lidskii [1] in 1961, can be considered a kind of multi-model stochastic system. An MJLS contains two mechanisms, i.e., the modes and the states. The modes are jumping dynamics, modeled by finite-state Markov chains. The states are continuous or discrete, modeled by a set of differential or difference equations. With the development of control science and stochastic theory, MJLSs have received wide attention and many research results are available, such as stochastic stability and stabilizability [2–4], controllability [5–9] and robust estimation and filtering [10–12]. In recent years, the adaptive optimal control problem has become a focus in controller design and many related works have been published. For example, the authors in [13] studied adaptive surface optimal control methods for strict-feedback systems. Then, an observer-based adaptive fuzzy control law was proposed for nonlinear nonstrict-feedback systems [14]. A general method
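The two mechanisms named in the introduction, a finite-state Markov chain for the mode and a differential equation for the state, can be illustrated with a minimal simulation sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function name `simulate_mjls`, the two mode matrices in `A_modes`, the generator matrix `Pi`, the absence of a control input, and the use of forward Euler integration with at most one jump per step.

```python
import numpy as np

def simulate_mjls(A_modes, Pi, x0, t_end, dt=1e-3, seed=0):
    """Simulate dx/dt = A[r(t)] x(t), where r(t) is a continuous-time
    Markov chain with generator matrix Pi (rows sum to zero)."""
    rng = np.random.default_rng(seed)
    n_modes = len(A_modes)
    x = np.asarray(x0, dtype=float)
    mode = 0  # hypothetical choice: start in the first mode
    # Holding time in mode i is exponential with rate -Pi[i, i].
    next_jump = rng.exponential(1.0 / -Pi[mode, mode])
    times, states, modes = [0.0], [x.copy()], [mode]
    n_steps = int(round(t_end / dt))
    for k in range(1, n_steps + 1):
        t = k * dt
        if t >= next_jump:
            # Pick the next mode with probability proportional to the
            # off-diagonal transition rates of the current mode.
            rates = Pi[mode].copy()
            rates[mode] = 0.0
            mode = int(rng.choice(n_modes, p=rates / rates.sum()))
            next_jump = t + rng.exponential(1.0 / -Pi[mode, mode])
        # Forward Euler step of the continuous state under the active mode.
        x = x + dt * (A_modes[mode] @ x)
        times.append(t)
        states.append(x.copy())
        modes.append(mode)
    return np.array(times), np.array(states), np.array(modes)

# Hypothetical two-mode example (values chosen only for illustration).
A_modes = [np.array([[-1.0, 0.5], [0.0, -2.0]]),
           np.array([[-0.5, 0.0], [1.0, -1.5]])]
Pi = np.array([[-2.0, 2.0],
               [3.0, -3.0]])
t, x, r = simulate_mjls(A_modes, Pi, x0=[1.0, -1.0], t_end=5.0)
```

The arrays `t`, `x` and `r` hold the time grid, the continuous state trajectory and the active mode at each step, so plotting `r` against `t` shows the jumps and plotting `x` shows how the state evolves under the mode that is active at each instant.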
