BEAT: the Behavior Expression Animation Toolkit

Justine Cassell
MIT Media Laboratory, 20 Ames St., E15-315, Cambridge, MA 02139, USA

Hannes Vilhjalmsson
MIT Media Laboratory, 20 Ames St., E15-320R, Cambridge, MA 02139, USA

Timothy Bickmore
MIT Media Laboratory, 20 Ames St., E15-320Q, Cambridge, MA 02139, USA

* This chapter is a reprint from the Proceedings of SIGGRAPH '01, August 12-17, Los Angeles, CA (ACM Press 2001), pp. 477-486, adapted in style for consistency. It appears in H. Prendinger et al. (eds.), Life-Like Characters, © Springer-Verlag Berlin Heidelberg 2004.

Summary. The Behavior Expression Animation Toolkit (BEAT) allows animators to input typed text that they wish to be spoken by an animated human figure, and to obtain as output appropriate and synchronized non-verbal behaviors and synthesized speech in a form that can be sent to a number of different animation systems. The non-verbal behaviors are assigned on the basis of actual linguistic and contextual analysis of the typed text, relying on rules derived from extensive research into human conversational behavior. The toolkit is extensible, so that new rules can be quickly added. It is designed to plug into larger systems that may also assign personality profiles, motion characteristics, scene constraints, or the animation styles of particular animators.
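To make the pipeline just described concrete, here is a minimal, hypothetical sketch of a rule-based tagger that maps typed text to word-aligned non-verbal behavior suggestions. It is only an illustration of the general idea under invented assumptions: the rule names, heuristics, and data structures below are placeholders and are not BEAT's actual modules, rule set, or output format.

    # Toy sketch: a rule-based text-to-behavior tagger in the spirit of the
    # summary above (typed text in, word-aligned behavior suggestions out).
    # The rules, names, and heuristics here are invented for illustration and
    # are NOT BEAT's actual modules, rule set, or API.
    import re
    from dataclasses import dataclass


    @dataclass
    class BehaviorTag:
        behavior: str    # e.g. "deictic_gesture", "beat_gesture"
        word_index: int  # index of the word the behavior is synchronized to
        rule: str        # name of the rule that suggested it


    @dataclass
    class Rule:
        name: str
        trigger: callable  # (word, index, all_words) -> bool
        behavior: str


    def contrastive_you(word, i, words):
        # Simplified stand-in for linguistic analysis: suggest pointing at the
        # listener only when "you" appears near an explicit contrast word.
        window = [w.lower() for w in words[max(0, i - 3):i + 4]]
        return word.lower() == "you" and any(w in {"not", "but", "instead"} for w in window)


    def new_information(word, i, words):
        # Crude "newness" heuristic: the first mention of a longer content
        # word gets an emphasis (beat) gesture.
        return len(word) > 4 and word.lower() not in (w.lower() for w in words[:i])


    DEFAULT_RULES = [
        Rule("contrastive_you", contrastive_you, "deictic_gesture"),
        Rule("new_information", new_information, "beat_gesture"),
    ]


    def suggest_behaviors(text, rules=DEFAULT_RULES):
        """Run every rule over every word and collect word-aligned suggestions."""
        words = re.findall(r"[A-Za-z']+", text)
        return [BehaviorTag(r.behavior, i, r.name)
                for i, w in enumerate(words) for r in rules if r.trigger(w, i, words)]


    if __name__ == "__main__":
        # Plain "you" yields no pointing gesture, matching the point made in
        # the introduction below; a contrastive "you" does.
        print(suggest_behaviors("If you want to come with me, get your coat on."))
        print(suggest_behaviors("Not me but you should carry the telescope."))

In a full system such word-aligned suggestions would then be filtered and scheduled against synthesized speech timing before being sent to an animation engine; the sketch stops at the suggestion step.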

1 Introduction

The association between speech and other communicative behaviors poses particular challenges to procedural character animation techniques. Increasing numbers of procedural animation systems are capable of generating extremely realistic movement, hand gestures, and facial expressions in silent characters. However, when voice is called for, issues of synchronization and appropriateness render otherwise more than adequate techniques disfluent. And yet there are many cases where we may want to animate a speaking character. While spontaneous gesturing and facial movement occur naturally and effortlessly in our daily conversational activity, a trained eye is called for when such associations between non-verbal behaviors and words must be made explicit. For example, untrained animators, and some autonomous animated interfaces, often generate a pointing gesture toward the listener when a speaking character says "you" ("If you want to come with me, get your coat on."). A point of this sort, however, never occurs in life (try it yourself and you will see that a pointing gesture might occur only if "you" is being contrasted with somebody else) and, what is much worse, it makes an animated speaking character seem stilted, as if speaking a language not its own. In fact, for this reason, many animators rely on video footage of actors reciting the text, for reference or rotoscoping, or, more recently, on motion-captured data to drive speaking characters. These are expensive methods that may involve a whole crew of people in addition to the expert animator. This may be worth doing for characters that play a central role on the screen, but