A Convex Optimization Approach to Dynamic Programming in Continuous State and Action Spaces
Insoon Yang
Received: 15 January 2020 / Accepted: 3 September 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020
Abstract
In this paper, a convex optimization-based method is proposed for numerically solving dynamic programs in continuous state and action spaces. The key idea is to approximate the output of the Bellman operator at a particular state by the optimal value of a convex program. The approximate Bellman operator has a computational advantage because it involves a convex optimization problem in the case of control-affine systems and convex costs. Using this feature, we propose a simple dynamic programming algorithm that evaluates the approximate value function at pre-specified grid points by solving convex optimization problems in each iteration. We show that the proposed method approximates the optimal value function with a uniform convergence property in the case of convex optimal value functions. We also propose an interpolation-free design method for a control policy whose performance converges uniformly to the optimum as the grid resolution becomes finer. When a nonlinear control-affine system is considered, the convex optimization approach provides an approximate policy with a provable suboptimality bound. For general cases, the proposed convex formulation of dynamic programming operators can be modified into a nonconvex bilevel program, in which the inner problem is a linear program, without losing the uniform convergence properties.

Keywords: Dynamic programming · Convex optimization · Optimal control · Stochastic control

Mathematics Subject Classification: 90C39 · 49L20 · 90C25
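For concreteness, the structure exploited by the approximate Bellman operator can be illustrated with the standard finite-horizon recursion, written here in a deterministic form with generic notation (not necessarily the paper's exact formulation):

\[
  v_t(x) \;=\; \min_{u \in \mathcal{U}} \big\{ c(x,u) + v_{t+1}\big(f(x,u)\big) \big\},
  \qquad f(x,u) = f_0(x) + f_1(x)\,u .
\]

If the stage cost $c(x,\cdot)$ and the next-stage value function $v_{t+1}$ are convex and the dynamics are affine in $u$, then $u \mapsto c(x,u) + v_{t+1}(f(x,u))$ is convex, so evaluating the right-hand side at a fixed state $x$ amounts to solving a convex program.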
Communicated by Lars Grüne.
Insoon Yang
[email protected]
Department of Electrical and Computer Engineering, Automation Systems Research Institute, Seoul National University, Seoul, South Korea
1 Introduction

Dynamic programming (DP) has been one of the most important methods for solving and analyzing sequential decision-making problems in optimal control, dynamic games, and reinforcement learning, among others. Using DP, we can decompose a complicated sequential decision-making problem into multiple tractable subproblems, whose optimal solutions are used to construct an optimal policy for the original problem. Numerical methods for DP are well studied for discrete-time Markov decision processes (MDPs) with discrete state and action spaces (e.g., [1]) and for continuous-time deterministic and stochastic optimal control problems in continuous state spaces (e.g., [2]). This paper focuses on the discrete-time case with continuous state and action spaces in the finite-horizon setting. Unlike infinite-dimensional linear programming (LP) methods (e.g., [3–5]), which require a finite-dimensional approximation of the LP problems, we develop a finite-dimensional convex optimization-based method that uses a discretization of the state space while not discretizing the action space.
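As a rough illustration of how such a grid-based backup could be carried out numerically, the sketch below evaluates a Bellman backup at a single grid point by solving a convex program with CVXPY. It is a hypothetical implementation under stated assumptions, not the construction developed in the paper: the dynamics are control-affine, the stage cost is convex in the input, and the next-stage value function is represented by a convex piecewise-affine approximation (a maximum of affine functions fitted to grid values), which keeps each backup a convex program over the continuous action.

```python
# Hypothetical sketch of one grid-point Bellman backup as a convex program.
# Assumptions (not from the paper): control-affine dynamics
#   x_next = f0(x) + f1(x) @ u,
# a stage cost that is convex in u, and a next-stage value function
# represented as a max of affine pieces, v(x) ~= max_i (a_i @ x + b_i).
import numpy as np
import cvxpy as cp

def bellman_backup(x, f0, f1, stage_cost, affine_pieces, u_dim, u_max):
    """Return an approximate (Tv)(x) and a minimizing input at grid point x."""
    u = cp.Variable(u_dim)
    x_next = f0(x) + f1(x) @ u              # affine in u for a fixed state x
    w = cp.Variable()                        # epigraph variable for v(x_next)
    constraints = [w >= a @ x_next + b for a, b in affine_pieces]
    constraints.append(cp.norm(u, "inf") <= u_max)   # box input constraint
    problem = cp.Problem(cp.Minimize(stage_cost(x, u) + w), constraints)
    problem.solve()
    return problem.value, u.value

# Example: double-integrator-like dynamics with a quadratic stage cost
# and a trivial terminal value function v(x) = 0 (a single affine piece).
dt = 0.1
f0 = lambda x: np.array([x[0] + dt * x[1], x[1]])
f1 = lambda x: np.array([[0.0], [dt]])
stage_cost = lambda x, u: cp.sum_squares(u) + float(x @ x)
affine_pieces = [(np.zeros(2), 0.0)]
value, u_opt = bellman_backup(np.array([1.0, 0.0]), f0, f1,
                              stage_cost, affine_pieces, u_dim=1, u_max=1.0)
```

Repeating this backup over all grid points and time steps (and refitting the affine pieces to the computed grid values after each backward step) gives the flavor of a grid-based DP pass in which each evaluation is a convex program rather than a search over a discretized action set.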