Emotion Modelling via Speech Content and Prosody: In Computer Games and Elsewhere


Björn Schuller

Abstract The chapter describes a typical modern speech emotion recognition engine as can be used to enhance the emotional intelligence of computer games or other technical systems. It highlights the acquisition of human affect via the spoken content and via its prosody and further acoustic features. Features for both of these information streams are briefly discussed, along with chunking of the stream. Decision making is presented both with and without training data. A particular focus is then laid on autonomous learning and adaptation methods as well as the required calculation of confidence measures. Practical aspects include the encoding of the information, distribution of the processing, and available toolkits. Benchmark performances are given by typical competitive challenges in the field.

Introduction

The automatic recognition of emotion in speech dates back some twenty years, counting from the very first attempts, cf., e.g., [9]. This chapter aims to give a general glance ‘under the hood’ at how today’s engines work. First, a very brief overview of the modelling of emotion is given. A special focus is then laid on speech emotion recognition in computer games, owing to the context of this book. Finally, the structure of the remaining chapter is provided, aiming at familiarising the reader with the general principles of current engines, their abilities, and their necessities.

B. Schuller, Imperial College London, 180 Queen’s Gate, SW7 2AZ London, UK
© Springer International Publishing Switzerland 2016. K. Karpouzis, G.N. Yannakakis (eds.), Emotion in Games, Socio-Affective Computing 4, DOI 10.1007/978-3-319-41316-7_5

Emotion Modelling

A number of different representation forms have been evaluated. The most popular ones are, first, discrete emotion classes such as ‘anger’, ‘joy’, or ‘neutral’, usually ranging from two to roughly a dozen classes [51] depending on the application of interest, and, second, a representation by continuous emotion ‘primitives’ in the sense of a number of (quasi-)value-continuous dimensions such as arousal/activation, valence/positivity/sentiment, dominance/power/potency, expectation/surprise/novelty, or intensity [43]. In a space spanned by these axes, the classes can be assigned as points or regions, thus allowing for a ‘translation’ between these two representation forms. Other popular approaches include tagging, which allows several class labels per instance of analysis (in the case of two labels, the name complex emotions has been used), and calculating a score per emotion class, leading to ‘soft emotion profiles’ [32], potentially with a minimum threshold to be exceeded. Besides choosing such a representation of emotion, one has to choose a temporal segmentation, as the speech needs to be segmented into units of analysis. This analysis itself can be based on the spoken content or the ‘way of speaking’ it in the sense of prosody, arti
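
The ‘translation’ between discrete classes and dimensional primitives, and the thresholded ‘soft emotion profile’, can be illustrated with a minimal Python sketch. The two-dimensional arousal/valence space, the coordinate values, and the function names below are assumptions made purely for illustration; they are not taken from the chapter or from any particular engine, and a nearest-point rule is only one possible way of assigning classes to regions.

```python
import math

# Hypothetical anchor points in an (arousal, valence) space, each axis in [-1, 1].
# The coordinates are illustrative placeholders, not values from the chapter.
CLASS_TO_POINT = {
    "anger": (0.8, -0.6),
    "joy": (0.6, 0.8),
    "neutral": (0.0, 0.0),
}


def class_to_dimensions(label):
    """Translate a discrete class into its assumed (arousal, valence) point."""
    return CLASS_TO_POINT[label]


def dimensions_to_class(arousal, valence):
    """Translate a point into the class whose anchor point is nearest (Euclidean)."""
    return min(
        CLASS_TO_POINT,
        key=lambda label: math.dist((arousal, valence), CLASS_TO_POINT[label]),
    )


def soft_profile(scores, min_score=0.0):
    """Keep only the classes whose score exceeds the minimum threshold."""
    return {label: score for label, score in scores.items() if score > min_score}


if __name__ == "__main__":
    print(dimensions_to_class(0.7, -0.5))
    print(soft_profile({"anger": 0.6, "joy": 0.3, "neutral": 0.1}, min_score=0.25))
```

With several labels surviving the threshold, the returned profile doubles as a simple form of tagging, since more than one class can be attached to the same unit of analysis.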
