Multi-cost Bounded Tradeoff Analysis in MDP
- PDF / 2,449,499 Bytes
- 40 Pages / 439.37 x 666.142 pts Page_size
- 98 Downloads / 196 Views
Multi-cost Bounded Tradeoff Analysis in MDP Arnd Hartmanns1 Tim Quatmann2
· Sebastian Junges2
· Joost-Pieter Katoen1,2
·
Received: 23 June 2020 / Accepted: 2 July 2020 © The Author(s) 2020
Abstract We provide a memory-efficient algorithm for multi-objective model checking problems on Markov decision processes (MDPs) with multiple cost structures. The key problem at hand is to check whether there exists a scheduler for a given MDP such that all objectives over cost vectors are fulfilled. We cover multi-objective reachability and expected cost objectives, and combinations thereof. We further transfer approaches for computing quantiles over single cost bounds to the multi-cost case and highlight the ensuing challenges. An empirical evaluation shows the scalability of our new approach both in terms of memory consumption and runtime. We discuss the need for more detailed visual presentations of results beyond Pareto curves and present a first visualisation approach that exploits all the available information from the algorithm to support decision makers. Keywords Markov decision process · Multi-objective verification · Pareto-optimal strategies · Cost-bounded reachability · Expected rewards · Probabilistic model checking
1 Introduction Markov decision processes [46] (MDPs) with rewards or costs are a popular model to describe planning problems under uncertainty. Planning algorithms aim to find strategies which perform well (or even optimally) for a given objective. These algorithms typically assume that
The authors are listed in alphabetical order. This work was supported by DFG RTG 2236 “UnRAVeL” and NWO VENI Grant 639.021.754.
B
Arnd Hartmanns [email protected] Sebastian Junges [email protected] Joost-Pieter Katoen [email protected]; [email protected] Tim Quatmann [email protected]
1
University of Twente, Enschede, The Netherlands
2
RWTH Aachen, Aachen, Germany
123
A. Hartmanns et al.
(a)
(b)
Fig. 1 Science on Mars: planning under several resource constraints
a goal is reached eventually (with probability one) and optimise the expected reward or cost to reach that goal [46,53]. This assumption however is unrealistic in many scenarios, e.g. due to insufficient resources or the possibility of attempted actions failing. Furthermore, the resulting optimal schedulers often admit single runs which perform far below the user’s expectation. Such deviations to the expected value are unsuitable in many scenarios with high stakes. Examples range from deliveries reaching an airport after the plane’s departure to more serious scenarios in e.g. wildfire management [56]. In particular, many practical scenarios call for minimising the probability to run out of resources before reaching the goal: while it is beneficial for a plane to reach its destination with low expected fuel consumption, it is essential to reach its destination with the fixed available amount of fuel. Schedulers that optimise solely for the probability to reach a goal are mostly very expensive. Even
Data Loading...