Compact and Interpretable Dialogue State Representation with Genetic Sparse Distributed Memory
Abstract User satisfaction is often considered the objective that a spoken dialogue system should achieve. This is why the reward function of Spoken Dialogue Systems (SDS) trained by Reinforcement Learning (RL) is often designed to reflect user satisfaction. To do so, the state space representation should be based on features capturing user satisfaction characteristics, such as the mean speech recognition confidence score. On the other hand, for deployment in industrial systems, there is a need for state representations that are understandable by system engineers. In this article, we propose to represent the state space using a Genetic Sparse Distributed Memory. This is a state aggregation method computing state prototypes which are selected so as to lead to the best linear representation of the value function in RL. To do so, previous work on Genetic Sparse Distributed Memory for classification is adapted to the Reinforcement Learning task and a new way of building the prototypes is proposed. The approach is tested on a corpus of dialogues collected with an appointment scheduling system. The results are compared to a grid-based linear parametrisation. It is shown that learning is accelerated and made more memory efficient. It is also shown that the framework is scalable in that it is possible to include many dialogue features in the representation, interpret the resulting policy and identify the most important dialogue features.

L. El Asri · R. Laroche
Orange Labs, Chatillon, France
e-mail: [email protected]
R. Laroche
e-mail: [email protected]
L. El Asri
UMI 2958 (CNRS-GeorgiaTech), Metz, France
O. Pietquin
University of Lille, CNRS, Lille, France
e-mail: [email protected]
O. Pietquin
UMR 9189—CRIStAL, 59000 Lille, France
O. Pietquin
Institut Universitaire de France (IUF), Paris, France

© Springer Science+Business Media Singapore 2017
K. Jokinen and G. Wilcock (eds.), Dialogues with Social Robots, Lecture Notes in Electrical Engineering 427, DOI 10.1007/978-981-10-2585-3_3
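To make the prototype-based representation described in the abstract concrete, the following is a minimal Python sketch, not the authors' Genetic SDM: each state activates the prototypes lying within a fixed radius, the value function is linear in this activation vector, and the weights can be updated by temporal-difference learning. The prototype count, the radius, and the three-dimensional feature space are illustrative assumptions.

```python
import numpy as np

# Illustrative sketch of SDM-style state aggregation for linear value
# approximation (not the paper's Genetic SDM). The prototype count,
# radius, and 3-dimensional feature space are made-up assumptions.

rng = np.random.default_rng(0)
prototypes = rng.uniform(0.0, 1.0, size=(50, 3))  # 50 prototypes in [0, 1]^3
theta = np.zeros(50)                               # one weight per prototype
radius = 0.3                                       # activation radius

def activation(state):
    """Binary activation vector: 1 for each prototype within `radius`."""
    dist = np.linalg.norm(prototypes - np.asarray(state), axis=1)
    return (dist <= radius).astype(float)

def value(state):
    """Linear value estimate over the active prototypes."""
    return theta @ activation(state)

def td_update(state, reward, next_state, alpha=0.1, gamma=0.99):
    """TD(0)-style update of the linear weights for one observed transition."""
    global theta
    td_error = reward + gamma * value(next_state) - value(state)
    theta += alpha * td_error * activation(state)
```

In the article, a genetic algorithm additionally selects the prototype set that best supports this linear representation; the sketch above keeps the prototypes fixed and random to isolate the value-function mechanics.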
Keywords Spoken dialogue systems · Reinforcement learning · State space representation · Genetic algorithms · Sparse distributed memory
1 Introduction

Reinforcement Learning (RL) [1] is now a state-of-the-art method for learning optimal policies for dialogue systems [2–6]. To apply it, a reward function has to be designed that describes how good each decision made by the system is; it encodes the goal of the system. Meanwhile, a commonly used metric for assessing dialogue management quality is user satisfaction [7, 8]. Since in RL the reward function defines the task of the system, it is natural to have the rewards reflect user satisfaction. There has been extensive research on automatically estimating user satisfaction for a given dialogue [9–11]. These studies have shown that many dialogue features (duration, mean speech recognition scores, number of help requests, …) can play an important role in user satisfaction [12, 13]. Because RL relies on a representation of the dialogue state, these features should be captured by the state space representation.
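As an illustration of such a feature-based state, the following minimal Python sketch collects a few of the satisfaction-related features mentioned above into a normalised vector. The class name, feature set, and normalisation constants are illustrative assumptions, not the representation used in the paper.

```python
from dataclasses import dataclass

# Hypothetical dialogue-state features tied to user satisfaction.
# Names and normalisation constants are illustrative assumptions.

@dataclass
class DialogueState:
    duration_s: float        # dialogue duration so far, in seconds
    mean_asr_conf: float     # mean speech recognition confidence in [0, 1]
    n_help_requests: int     # number of user help requests so far

    def features(self, max_duration_s=600.0, max_help=5):
        """Normalised feature vector usable as input to an RL state space."""
        return [
            min(self.duration_s / max_duration_s, 1.0),
            self.mean_asr_conf,
            min(self.n_help_requests / max_help, 1.0),
        ]

state = DialogueState(duration_s=120.0, mean_asr_conf=0.82, n_help_requests=1)
print(state.features())  # -> [0.2, 0.82, 0.2]
```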