Emotion Recognition Using Excitation Source Information
Abstract This chapter details the excitation source features used for recognizing emotions. The motivation for exploring excitation source information for emotion recognition is illustrated by demonstrating speech files that contain source information alone. Details of the extraction of the proposed excitation source features ((i) sequence of LP residual samples, (ii) LP residual phase, (iii) epoch parameters and (iv) glottal pulse parameters) are given. Two emotional speech databases are introduced to validate the proposed excitation source features. The functionality of classification models such as auto-associative neural networks and support vector machines is briefly explained. Finally, recognition performance using the proposed excitation source features is analyzed in detail.
3.1 Introduction
In the previous two chapters, speech emotion recognition was introduced as an important research area and the related work was discussed. This chapter deals with the use of excitation source information for recognizing the underlying emotions from speech utterances [128]. Among the different sources of speech information, excitation source information is treated almost like noise, is assumed to contain no information beyond the fundamental frequency of speech (because it mostly contains the unpredictable part of the speech), and has been largely ignored by the speech research community [128]. Some of the speech systems developed using excitation source features are discussed in the second chapter. However, no systematic study has been carried out on speech emotion recognition using excitation source information. The linear prediction (LP) residual represents the prediction error in the LP analysis of speech, and it is considered to be the excitation signal to the vocal tract system during speech production. In this chapter, we explore features extracted from the LP residual, epochs and the glottal volume velocity (GVV) waveform for classifying speech emotions. These features are referred to as excitation source features, or simply source features.
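The notion of the LP residual as a prediction error can be made concrete with a minimal sketch. It assumes the standard autocorrelation method of LP analysis, in which each sample is predicted as a weighted sum of the previous p samples and the residual is the difference between the actual and predicted samples. The function names `lp_coefficients` and `lp_residual`, the prediction order of 10, and the synthetic frame are illustrative assumptions and are not taken from the chapter; windowing and pre-emphasis, which are common in practice, are omitted for brevity.

```python
import numpy as np


def lp_coefficients(frame, order):
    """Estimate LP coefficients a[1..p] by the autocorrelation method.

    The model is s[n] ~ sum_{k=1..p} a[k] * s[n-k]. (Hypothetical helper.)
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation values r[0], ..., r[p].
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    # Normal equations R a = [r[1], ..., r[p]], with R[i, j] = r[|i - j|].
    idx = np.arange(order)
    R = r[np.abs(idx[:, None] - idx[None, :])]
    return np.linalg.solve(R, r[1:])


def lp_residual(frame, a):
    """Return the prediction error e[n] = s[n] - sum_k a[k] * s[n-k].

    Samples before the start of the frame are treated as zero.
    """
    frame = np.asarray(frame, dtype=float)
    residual = frame.copy()
    for k in range(1, len(a) + 1):
        residual[k:] -= a[k - 1] * frame[:-k]
    return residual


# Usage on a synthetic frame (a stand-in for one frame of voiced speech).
rng = np.random.default_rng(0)
frame = np.sin(2 * np.pi * 0.05 * np.arange(400)) + 0.1 * rng.standard_normal(400)
a = lp_coefficients(frame, order=10)
residual = lp_residual(frame, a)
```

Because the predictor captures the correlated (vocal-tract-like) part of the signal, the residual is what remains unpredicted, which is why it is taken as an approximation of the excitation signal.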
S.R. Krothapalli and S.G. Koolagudi, Emotion Recognition using Speech Features, SpringerBriefs in Electrical and Computer Engineering, DOI 10.1007/978-1-4614-5143-3_3, © Springer Science+Business Media New York 2013
An epoch is an event representing the instant of glottal closure during the production of voiced speech. The glottal volume velocity (GVV) signal represents the airflow pattern through the glottis. The chapter is organized as follows. The motivation for exploring the excitation source features is discussed in Sect. 3.2. The emotional speech corpora used in this work are briefly described in Sect. 3.3. Section 3.4 explains in detail the extraction of the various excitation source features proposed in this chapter. Section 3.5 discusses the details of the classification models used in this work for classifying the emotions. Section 3.6 discusses the results of speech emoti