The Role and Importance of Speech Standards




Paolo Baggia, Daniel C. Burnett, Rob Marchand, and Val Matula

Abstract Within only a few years, the landscape of speech and DTMF applications changed from being based on proprietary languages to being based entirely on speech standards. The W3C Voice Browser Working Group (VBWG) played a role of primary importance in that change. This chapter describes the change and its implications, and highlights the standards created by the W3C VBWG, as well as the benefits that these standards can bring to many other application fields, including multimodal interfaces.

2.1 Introduction

In the late 1990s, a strong wind of change swept through the stuffy world of Interactive Voice Response (IVR) and speech applications in general. This call for change culminated in a key event: the workshop on “Voice Browsers” held in Cambridge, MA on 13 October 1998 [1]. The workshop was sponsored by the W3C, and it raised huge interest in the standardization of voice application technologies. The direct result was the birth of a W3C Working Group, the Voice Browser Working Group (W3C VBWG [2]), formed to create an interconnected family of standards. This chapter offers a short introduction to most of the W3C VBWG standards and also describes their close relationship with the W3C Multimodal Interaction Working Group (W3C MMI [3]).

P. Baggia (*)
Department of Enterprise, Nuance Communications, Inc., Torino, Italy

D.C. Burnett
StandardsPlay, Lilburn, GA, USA

R. Marchand
Genesys, Markham, ON, Canada

V. Matula
Avaya Inc., Santa Clara, CA, USA

© Springer International Publishing Switzerland 2017
D.A. Dahl (ed.), Multimodal Interaction with W3C Standards, DOI 10.1007/978-3-319-42816-1_2


Several factors combined to drive this change. The most relevant were:

– Developing an IVR application was cumbersome and required proprietary IDEs that were bound to individual vendors. At the time the VBWG was formed, IVR technology was proprietary, and there was virtually no chance to exchange expertise or application assets between vendors.
– Speech technologies were very limited in their use: only simple commands, menu options, and sequences of digits were allowed. However, the core speech technologies were rapidly evolving to become more powerful and flexible, enabling a new generation of speech applications.
– Voice interactions were limited to simple menu navigation, with no flexibility for more advanced dialog capabilities, and their implementation was clumsy.
– The most powerful factor, however, was the advent of the Internet era: HTTP, HTML, and the flourishing of web sites. All these advances were based on public standards, while the world of voice applications was missing the opportunity to follow these new trends.

It was this combination of factors that drove the creation of the W3C VBWG, changing forever the world of voice applications. With the Working Group, a large num