Multimodal Dialogue Systems
USA
Abstract
Multimodal systems are moving to the centre stage of dialogue research, reflecting the maturing of component technologies and the recognition that "interface" per se may not be the best way to describe the functionality of such systems. Rather, they are emerging as full communications systems, with a correspondingly rich set of expressive capabilities.
Keywords:
Modes and modalities; Context; Intention learning and adaptation.
1. Introduction
Multimodal interfaces allow humans to create inputs for a machine in a natural and concise form, using the mode or mixture of modes that most precisely conveys the intended meaning, and to adjust this mix to reflect communication needs. For example, specifying a travel destination is most easily done by voice, but describing the shape of an object, especially an irregular one, is more easily done by gesture. Embedding such inputs in a consistent sequence of interactions, a dialogue, produces a rich conversation that might otherwise be difficult to carry out with the same degree of efficiency using any single mode. One could even say that true communication is only possible when human (and machine) expression is allowed to range over multiple modes, optimally matching the needs of the communication to the modes available. Having made this observation, we are still left with the problem of understanding what principles might govern multimodal communication and how these principles might be reduced to practice. The papers in this section present a cross-section of work in the area of multimodal dialogue systems that attempts to address this question. While each application area presents the researcher with its own unique set of challenges, we can nevertheless discern a number of common themes that cut across individual system-building efforts. These commonalities could be understood to define the agenda for contemporary research in multimodal dialogue.
W. Minker, D. Bühler and L. Dybkjær (eds), Spoken Multimodal Human-Computer Dialogue in Mobile Environments, 3-11. © 2005 Springer. Printed in the Netherlands.
2. Varieties of Multimodal Dialogue
The term "multimodal" can have several meanings, and it is useful to keep these distinct. The papers in this section mostly understand multimodal to refer to interfaces that give the human the opportunity to produce multiple separate input streams, for example, voice and gesture. As implemented in such systems, the streams are under the voluntary control of the user: in the process of composing an input, the human can consciously decide to use one mode, the other, or a combination of modes to formulate an input that is appropriate for a particular situation. But the term multimodal can also be used to describe other types of inputs that humans naturally produce in the process of communication. For example, we can also understand multimodal in terms of recognition that takes in multiple coordinate streams of information generated by the user,