Markov Decision Processes With Their Applications

Markov decision processes (MDPs), also called stochastic dynamic programming, were first studied in the 1960s. MDPs can be used to model and solve dynamic decision-making problems that are multi-period and occur in stochastic circumstances. There are thre


Advances in Mechanics and Mathematics
VOLUME 14

Series Editor:
David Y. Gao, Virginia Polytechnic Institute and State University, U.S.A
Ray W. Ogden, University of Glasgow, U.K.

Advisory Editors:
I. Ekeland, University of British Columbia, Canada
S. Liao, Shanghai Jiao Tung University, P.R. China
K.R. Rajagopal, Texas A&M University, U.S.A.
T. Ratiu, Ecole Polytechnique, Switzerland
W. Yang, Tsinghua University, P.R. China

MARKOV DECISION PROCESSES WITH THEIR APPLICATIONS

By
Prof. Ph.D. Qiying Hu, Fudan University, China
Prof. Ph.D. Wuyi Yue, Konan University, Japan

Library of Congress Control Number: 2006930245 ISBN-13: 978-0-387-36950-1

e-ISBN-13: 978-0-387-36951-8

Printed on acid-free paper.

AMS Subject Classifications: 90C40, 90C39, 93C65, 91B26, 90B25

© 2008 Springer Science+Business Media, LLC

All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

springer.com

Contents

List of Figures
List of Tables
Preface
Acknowledgments

1. INTRODUCTION
  1 A Brief Description of Markov Decision Processes
  2 Overview of the Book
  3 Organization of the Book

2. DISCRETE TIME MARKOV DECISION PROCESSES: TOTAL REWARD
  1 Model and Preliminaries
    1.1 System Model
    1.2 Some Concepts
    1.3 Finiteness of the Reward
  2 Optimality Equation
    2.1 Validity of the Optimality Equation
    2.2 Properties of the Optimality Equation
  3 Properties of Optimal Policies
  4 Successive Approximation
  5 Sufficient Conditions
  6 Notes and References

3. DISCRETE TIME MARKOV DECISION PROCESSES: AVERAGE CRITERION
  1 Model and Preliminaries
  2 Optimality Equation
    2.1 Properties of ACOE and Optimal Policies
    2.2 Sufficient Conditions
    2.3 Recurrent Conditions
  3 Optimality Inequalities
    3.1 Conditions
    3.2 Properties of ACOI and Optimal Policies
  4 Notes and References

4. CONTINUOUS TIME MARKOV DECISION PROCESSES
  1 A Stationary Model: Total Reward
    1.1 Model and Conditions
    1.2 Model Decomposition
    1.3 Some Properties
    1.4 Optimality Equation and Optimal Policies
  2 A Nonstationary Model: Total Reward
    2.1 Model and Conditions
    2.2 Optimality Equation
  3 A Stationary Model: Average Criterion
  4 Notes and References

5. SEMI-MARKOV DECISION PROCESSES
  1 Model and Conditions
    1.1 Model
    1.2 Regular Conditions
    1.3 Criteria
  2 Transformat