Evaluating Dialogue Strategies in Multimodal Dialogue Systems
Abstract
Previous research suggests that multimodal dialogue systems providing both speech and pen input, and outputting a combination of spoken language and graphics, are more robust than unimodal systems based on speech or graphics alone (André, 2002; Oviatt, 1999). Such systems are complex to build, and significant research and evaluation effort must typically be expended to generate well-tuned modules for each system component. This chapter describes experiments utilising two complementary evaluation methods that can expedite the design process: (1) a Wizard-of-Oz data collection and evaluation using a novel Wizard tool we developed; and (2) an Overhearer evaluation experiment utilising logged interactions with the real system. We discuss the advantages and disadvantages of both methods and summarise how these two experiments have informed our research on dialogue management and response generation for the multimodal dialogue system MATCH.
Keywords:
User modelling; Natural language generation; Wizard-of-Oz experiments; Overhearer method; User-adaptive generation; Multiattribute decision theory.
1. Introduction
W. Minker, D. Bühler and L. Dybkjær (eds), Spoken Multimodal Human-Computer Dialogue in Mobile Environments, pp. 247-268. © 2005 Springer. Printed in the Netherlands.

Multimodal dialogue systems promise users mobile access to a complex and constantly changing body of information. However, mobile information access devices such as PDAs, tablet PCs, and next-generation phones offer limited screen real estate and no keyboard or mouse. Previous research suggests that spoken language interaction is highly desirable for such systems, and that systems that provide both speech and pen input, and that output a combination of spoken language and graphics, are more robust than unimodal systems (André, 2002; Oviatt, 1999). However, such systems are complex to build, and significant research and evaluation effort must typically be expended to generate well-tuned modules for each system component. Furthermore, during the development process, it is necessary to evaluate individual components to inform the design process before the whole system is robust enough for data collection with real users. This chapter describes experiments utilising two complementary evaluation methods that can be applied to collect information useful for design during the design process itself. We summarise how we have used these methods to inform our research on improved algorithms for (a) dialogue management and (b) generation for information presentation in multimodal dialogue. Our testbed application is MATCH (Multimodal Access To City Help), a dialogue system providing information for New York City (Johnston and Bangalore, 2000; Bangalore and Johnston, 2000; Johnston and Bangalore, 2001; Johnston et al., 2002). MATCH runs standalone on a Fujitsu PDA, as shown in Figure 1, yet can also run in client-server mode across a wireless network. MATCH provides users with mobile access to restaurant