Reinforcement learning applied to airline revenue management
RESEARCH ARTICLE
Nicolas Bondoux1 · Anh Quan Nguyen1 · Thomas Fiig2 · Rodrigo Acuna‑Agost1

Received: 10 December 2018 / Accepted: 8 November 2019
© Springer Nature Limited 2020
Abstract

Reinforcement learning (RL) is an area of machine learning concerned with how agents take actions to optimize a given long-term reward by interacting with the environment they are placed in. Some well-known recent applications include self-driving cars and computers playing games with super-human performance. One of the main advantages of this approach is that there is no need to explicitly model the nature of the interactions with the environment. In this work, we present a new airline Revenue Management System (RMS) based on RL, which does not require a demand forecaster. The optimization module remains but works in a different way. It is theoretically proven that RL converges to the optimal solution; however, in practice, the system may require a significant amount of data (a booking history with millions of daily departures) to learn the optimal policies. To overcome these difficulties, we present a novel model that integrates domain knowledge with a deep neural network trained on GPUs. The results are very encouraging in different scenarios and open the door for a new generation of RMSs that could automatically learn by directly interacting with customers.

Keywords Revenue Management System · Machine Learning · Reinforcement Learning · Deep Reinforcement Learning · Q-Learning · Deep Q-Learning
* Thomas Fiig
  [email protected]

  Nicolas Bondoux
  [email protected]

  Anh Quan Nguyen
  [email protected]

  Rodrigo Acuna‑Agost
  [email protected]

1 Research, Innovation and Ventures, Amadeus S.A.S., 485, Route du Pin Montard, 06902 Sophia Antipolis Cedex, France

2 Amadeus IT Group, Lufthavnsboulevarden 14, 2770 Kastrup, Denmark

Introduction and motivation

Airlines have relied on Revenue Management Systems (RMSs) to optimize their revenue since the mid to late 1970s. While RMS has evolved to support dramatic changes in airline business models over time—from state-owned monopolies in a heavily regulated environment to a deregulated competitive marketplace—several critical assumptions have remained embedded in RMS since its inception:

• RMS explicitly assumes that customer demand follows a parametric demand model specified by a system designer.
• RMS practices only passive learning, by always offering the myopic revenue-maximizing price. It never actively explores different prices to validate its assumptions.
• RMS does not explicitly account for competition; it is only indirectly aware of how competitor prices and availabilities affect its own bookings.

RMS’s parametric demand model provides robustness (compared to a non-parametric model) and allows the model parameters to be interpreted by human RM analysts. However, this may also be a weak point if actual customer behavior deviates from the assumed demand model. Furthermore, there is no guara…
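The active exploration that conventional RMS lacks is the core mechanism of the Q-learning approach named in the keywords. The following is a minimal illustrative sketch only: tabular Q-learning with epsilon-greedy exploration on a toy single-leg pricing problem. The fares, demand probabilities, and hyperparameters below are hypothetical assumptions for illustration, not values or methods taken from this paper.

```python
import random

# Illustrative sketch: a toy stateless pricing problem where the agent must
# learn fare values by trying them, rather than trusting a fixed demand model.
# All numbers here are hypothetical.

PRICES = [100, 150, 200]              # candidate fares (the agent's actions)
ALPHA, EPSILON = 0.05, 0.2            # learning rate, exploration rate

def sale_probability(price):
    """Toy (unknown-to-the-agent) demand model: higher fares sell less often."""
    return {100: 0.9, 150: 0.4, 200: 0.1}[price]

def q_learning(episodes=5000, seed=0):
    rng = random.Random(seed)
    q = {p: 0.0 for p in PRICES}      # single-state Q-table: Q[action]
    for _ in range(episodes):
        # Epsilon-greedy: explore a random fare with probability EPSILON,
        # otherwise exploit the fare with the highest current value estimate.
        if rng.random() < EPSILON:
            price = rng.choice(PRICES)
        else:
            price = max(q, key=q.get)
        # Stochastic reward: fare collected if the customer books, else 0.
        reward = price if rng.random() < sale_probability(price) else 0.0
        # One-step Q-update; each episode is terminal, so no bootstrap term.
        q[price] += ALPHA * (reward - q[price])
    return q

q = q_learning()
```

Each Q-value drifts toward the expected revenue of its fare, so the agent discovers the revenue-maximizing price without any parametric demand model, which is the contrast with passive RMS learning that the bullets above draw.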