Dynamic programming: Continuous-time optimal control; Hamilton-Jacobi-Bellman equation
Even though dynamic programming [2] was originally developed for the solution of problems which exhibit discrete types of decisions, it has also been applied to continuous formulations. In this article, the application of dynamic programming to the solution of continuous-time optimal control problems is discussed. By discretizing the problem, applying the dynamic programming equations, and then returning to the continuous domain, a partial differential equation results: the Hamilton-Jacobi-Bellman equation (HJB equation). This equation is often referred to as the continuous-time equivalent of the dynamic programming algorithm. In this article, the HJB equation will first be derived. A simple application will be presented, in addition to its use in solving the linear quadratic control problem. Finally, a brief overview of some solution methods and applications presented in the literature will be given.

Problem Formulation. The dynamic programming approach will be applied to a system of the following form:

\dot{z}(t) = f(z(t), u(t)), \quad z(0) = z_0, \quad 0 \le t \le T,    (1)
where z(t) ∈ R^n is the state vector at time t with time derivative \dot{z}(t), u(t) ∈ U ⊂ R^m is the control vector at time t, U is the set of control constraints, and T is the terminal time. The function f(z(t), u(t)) is continuously differentiable with respect to z and continuous with respect to u. The set of admissible control trajectories is given by the piecewise constant functions {u(t) : u(t) ∈ U, ∀t ∈ [0, T]}. It is assumed that, for any admissible control trajectory, a state trajectory z^u(t) exists and is unique. The objective is to determine a control trajectory and the corresponding state trajectory which minimize a cost function of the form:

h(z^u(T)) + \int_0^T g(z^u(t), u(t)) \, dt,    (2)
where the functions g and h are continuously differentiable with respect to both z and u.
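To make the formulation concrete, the following Python sketch evaluates the cost (2) for one admissible piecewise constant control. The scalar instance f(z, u) = u, g(z, u) = z^2 + u^2, h(z) = z^2, U = [-1, 1], the initial state z_0 = 0.5, and the particular control are illustrative assumptions, not data taken from this article.

```python
from scipy.integrate import solve_ivp

# Hypothetical scalar instance of (1)-(2), chosen only for illustration.
T = 1.0
def f(z, u): return u                # system dynamics
def g(z, u): return z**2 + u**2      # running cost
def h(z): return z**2                # terminal cost

def u_of_t(t):
    # An admissible piecewise constant control taking values in U = [-1, 1].
    return 1.0 if t < T / 2 else -1.0

def augmented(t, y):
    # Augment the state with the accumulated running cost, so one ODE solve
    # produces both z^u(T) and the integral of g appearing in (2).
    z, _ = y
    u = u_of_t(t)
    return [f(z, u), g(z, u)]

sol = solve_ivp(augmented, (0.0, T), [0.5, 0.0], max_step=1e-3)  # z(0) = z_0 = 0.5
zT, running = sol.y[0, -1], sol.y[1, -1]
print("cost (2):", h(zT) + running)  # h(z^u(T)) + integral of g(z^u(t), u(t)) dt
```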
Derivation. The derivation of the Hamilton-Jacobi-Bellman equation is taken from [3]. The time horizon is first discretized into N equally spaced intervals with:

\delta = \frac{T}{N}.

Also, the state and control are represented by:

z_k = z(k\delta), \quad k = 0, \ldots, N,
u_k = u(k\delta), \quad k = 0, \ldots, N.
The continuous-time system is approximated by:

z_{k+1} = z_k + f(z_k, u_k)\,\delta.
The cost function is rewritten as:

h(z_N) + \sum_{k=0}^{N-1} g(z_k, u_k)\,\delta.
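As a rough numerical illustration of this discretization, the sketch below applies z_{k+1} = z_k + f(z_k, u_k)δ and accumulates the discretized cost for the same hypothetical instance used above; as N grows the result should approach the continuous cost (2).

```python
T = 1.0
def f(z, u): return u                # same hypothetical instance as before
def g(z, u): return z**2 + u**2
def h(z): return z**2

def discrete_cost(N, z0=0.5):
    delta = T / N                                # delta = T / N
    z, total = z0, 0.0
    for k in range(N):
        u = 1.0 if k * delta < T / 2 else -1.0   # sampled control u_k = u(k*delta)
        total += g(z, u) * delta                 # term of sum_k g(z_k, u_k) * delta
        z = z + f(z, u) * delta                  # z_{k+1} = z_k + f(z_k, u_k) * delta
    return h(z) + total                          # h(z_N) + running-cost sum

for N in (10, 100, 1000):
    print(N, discrete_cost(N))                   # approaches the continuous cost (2)
```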
The dynamic programming algorithm is now applied with the following definition: J*(t, z) is the optimal cost-to-go for the continuous problem. Applied to the discretized problem, the dynamic programming equations then take the form:
J^*(N\delta, z) = h(z),    (3)

J^*(k\delta, z) = \min_{u \in U} \left[ g(z, u)\,\delta + J^*\big((k+1)\delta,\; z + f(z, u)\,\delta\big) \right], \quad k = 0, \ldots, N-1.    (4)
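The recursion (3)-(4) can be carried out numerically by tabulating the cost-to-go on a state grid and sweeping backward in k. The sketch below does this for the same hypothetical scalar instance, sampling U finitely and using linear interpolation to evaluate J*((k+1)δ, ·) off the grid; the grid ranges and resolutions are arbitrary illustrative choices.

```python
import numpy as np

def f(z, u): return u                 # hypothetical instance as before
def g(z, u): return z**2 + u**2
def h(z): return z**2

T, N = 1.0, 50
delta = T / N
z_grid = np.linspace(-2.0, 2.0, 201)  # state grid
u_grid = np.linspace(-1.0, 1.0, 21)   # finite sample of the control set U

J = h(z_grid)                          # eq. (3): J*(N*delta, z) = h(z)
for k in range(N - 1, -1, -1):
    # eq. (4): J*(k*delta, z) = min_u [ g(z,u)*delta + J*((k+1)*delta, z + f(z,u)*delta) ]
    candidates = [
        g(z_grid, u) * delta + np.interp(z_grid + f(z_grid, u) * delta, z_grid, J)
        for u in u_grid
    ]
    J = np.min(np.stack(candidates), axis=0)

print(np.interp(0.5, z_grid, J))       # approximate optimal cost-to-go from z = 0.5 at t = 0
```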
It is assumed that J*(t, z) has the necessary differentiability requirements to write the following Taylor series expansion:
J^*\big((k+1)\delta,\; z + f(z, u)\,\delta\big) = J^*(k\delta, z) + \nabla_t J^*(k\delta, z)\,\delta + \nabla_z J^*(k\delta, z)^T f(z, u)\,\delta + o(\delta).    (5)

Substituting (5) into (4) gives

J^*(k\delta, z) = \min_{u \in U} \left[ g(z, u)\,\delta + J^*(k\delta, z) + \nabla_t J^*(k\delta, z)\,\delta + \nabla_z J^*(k\delta, z)^T f(z, u)\,\delta + o(\delta) \right].

Cancelling J^*(k\delta, z) from both sides, dividing by \delta, and letting \delta \to 0, with V(t, z) denoting the resulting cost-to-go function, yields the Hamilton-Jacobi-Bellman equation:

0 = \min_{u \in U} \left[ g(z, u) + \nabla_t V(t, z) + \nabla_z V(t, z)^T f(z, u) \right], \quad \forall z, t,

with the boundary condition V(T, z) = h(z).
Example. Consider the simple dynamic system

\dot{z}(t) = u(t),

with the control bounded by u(t) ∈ [-1, 1] and time over the range [0, T].
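As a hedged illustration of the HJB equation for this system, suppose (purely as an assumption for this sketch, not taken from the article) a terminal cost h(z) = z(T)^2/2 and no running cost. Under that assumption the cost-to-go is V(t, z) = max(0, |z| - (T - t))^2 / 2, and the finite-difference check below confirms that it satisfies 0 = V_t + min_{|u| ≤ 1} V_z u = V_t - |V_z| together with the boundary condition V(T, z) = h(z).

```python
import numpy as np

T = 1.0
t = np.linspace(0.0, T, 201)
z = np.linspace(-2.0, 2.0, 401)
tt, zz = np.meshgrid(t, z, indexing="ij")

# Candidate cost-to-go under the ASSUMED cost h(z) = z^2/2, g = 0.
V = 0.5 * np.maximum(0.0, np.abs(zz) - (T - tt)) ** 2

Vt = np.gradient(V, t, axis=0)         # dV/dt by finite differences
Vz = np.gradient(V, z, axis=1)         # dV/dz by finite differences
residual = Vt - np.abs(Vz)             # HJB residual with g = 0, f(z,u) = u, U = [-1, 1]

print(np.abs(residual).max())                # near zero, up to discretization error
print(np.abs(V[-1] - 0.5 * z**2).max())      # boundary condition V(T, z) = h(z)
```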