Risk Sensitive Markov Decision Process for Portfolio Management

Abstract. In the Portfolio Management problem, the agent has to decide how to allocate resources among a set of stocks in order to maximize gains. Some researchers model this decision-making problem as a Markov decision process (MDP), and the most widely used criterion in MDPs is maximizing the expected total reward. However, this criterion does not take risk into account. To deal with risk, risk sensitive Markov decision processes (RSMDPs) are used. To the best of our knowledge, RSMDPs, and more specifically RSMDPs with an exponential utility function, have never been applied to this problem. In this paper we introduce a strategy to model the Portfolio Management problem, focused on day trade operations, in order to enable the use of dynamic programming. We also introduce a measure based on Conditional Value-at-Risk (CVaR) to evaluate the risk attitude. The experiments show that, with our model and RSMDPs with an exponential utility function, it is possible to change and interpret the agent's risk attitude in a very understandable way.

Keywords: Markov decision process · Risk sensitive Markov decision process · Planning and scheduling · Portfolio management

1 Introduction

In the Portfolio Management problem [9], the agent has to decide how to allocate resources among a set of stocks in order to maximize gains. Stock gains are stochastic and depend on the behavior of the market, which can be calm or volatile. These characteristics of the problem, the access to real data, and the vast number of assets available have attracted the attention of many researchers, and some of them tackle the problem by modeling it as a Markov decision process [4,5,12]. A Markov decision process (MDP) is a mathematical model [13] widely used in sequential decision-making problems. It provides a framework to represent the interaction between an agent and an environment through the definition of a set of states, actions, transition probabilities and rewards. In MDPs the agent must find an optimal policy (a mapping from states to actions) that maximizes the accumulated discounted reward. Many strategies to model the Portfolio Management problem as an MDP can be found in the literature [1,2,4,5,12].

© Springer Nature Switzerland AG 2020. L. Martínez-Villaseñor et al. (Eds.): MICAI 2020, LNAI 12468, pp. 370–382, 2020. https://doi.org/10.1007/978-3-030-60884-2_27

Another aspect studied in the Portfolio Management problem is the risk involved in decision making. Some works have applied risk averse models to the Portfolio Management problem [2,12] or have made use of measures with implicit risk aversion criteria [4]; however, in these models it is impossible to parameterize the agent's risk behavior when desired. A risk sensitive Markov decision process (RSMDP) [3,6–8,10,11] is an extension of an MDP used to model problems where the attitude toward risk needs to be taken into account and to the best of our knowledge,