Speaker independent feature selection for speech emotion recognition: A multi-task approach
Elham Kalhor¹ and Behzad Bakhtiari¹

Received: 4 May 2019 / Revised: 25 August 2020 / Accepted: 19 October 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020
Abstract
Nowadays, automatic speech emotion recognition has numerous applications. One of the important steps in these systems is feature selection. Because it is not known which acoustic features of a person's speech are related to its emotional content, much effort has been made to introduce a wide range of acoustic features. However, since employing all of these features lowers the learning efficiency of classifiers, a subset must be selected. Moreover, when there are several speakers, the chosen features must also be speaker-independent. For this reason, the present paper attempts to select features that are not only related to the emotion of speech but are also speaker-independent. To this end, the current study proposes a multi-task approach that selects the proper speaker-independent features for each pair of classes. The selected features are then given to a binary classifier for that pair, and the outputs of these classifiers are appropriately combined to produce the answer to the multi-class problem. Simulation results reveal that the proposed approach outperforms other methods in both recognition accuracy and runtime.

Keywords: Speech emotion recognition · Multi-task feature selection · Speaker-independent features
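The abstract describes a three-stage pipeline: select speaker-independent features for each pair of emotion classes, train a binary classifier per pair, and combine the pairwise outputs into a multi-class decision. The following is a minimal one-vs-one sketch of that structure in plain Python; the mean-difference ranking criterion and nearest-centroid classifiers are illustrative placeholders, not the authors' multi-task selector or their actual classifiers.

```python
from itertools import combinations

def select_pair_features(X, y, a, b, k):
    """Rank features by |mean_a - mean_b| and keep the top k
    (an illustrative criterion, not the paper's multi-task selector)."""
    Xa = [x for x, lab in zip(X, y) if lab == a]
    Xb = [x for x, lab in zip(X, y) if lab == b]
    mean = lambda rows, j: sum(r[j] for r in rows) / len(rows)
    score = [abs(mean(Xa, j) - mean(Xb, j)) for j in range(len(X[0]))]
    return sorted(range(len(X[0])), key=score.__getitem__, reverse=True)[:k]

def train_pair(X, y, a, b, feats):
    """Nearest-centroid binary classifier restricted to the selected features."""
    def centroid(lab):
        rows = [[x[j] for j in feats] for x, l in zip(X, y) if l == lab]
        return [sum(col) / len(rows) for col in zip(*rows)]
    ca, cb = centroid(a), centroid(b)
    def predict(x):
        xs = [x[j] for j in feats]
        da = sum((u - v) ** 2 for u, v in zip(xs, ca))
        db = sum((u - v) ** 2 for u, v in zip(xs, cb))
        return a if da <= db else b
    return predict

def one_vs_one(X, y, k=2):
    """One classifier per class pair, each trained on its own feature
    subset; pairwise predictions are combined by majority vote."""
    classes = sorted(set(y))
    pairs = [train_pair(X, y, a, b, select_pair_features(X, y, a, b, k))
             for a, b in combinations(classes, 2)]
    def predict(x):
        votes = {c: 0 for c in classes}
        for clf in pairs:
            votes[clf(x)] += 1
        return max(classes, key=votes.get)
    return predict

# Toy data: three "emotion" classes separated along features 0 and 1;
# features 2 and 3 are uninformative and get filtered out per pair.
X = [[0, 0, 5, 5], [1, 0, 5, 5],    # class 0
     [10, 0, 5, 5], [11, 0, 5, 5],  # class 1
     [0, 10, 5, 5], [0, 11, 5, 5]]  # class 2
y = [0, 0, 1, 1, 2, 2]
predict = one_vs_one(X, y, k=2)
```

The key property mirrored from the paper is that each class pair gets its own feature subset, so a feature useful for separating, say, anger from sadness need not be used when separating happiness from neutrality.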
1 Introduction

An individual's emotional state is formed from a set of basic emotions, such as happiness, sadness, anger, disgust, boredom, surprise, fear, and neutrality. Emotions play a prominent role in human communication and are critical in shaping behavior under different conditions.
* Behzad Bakhtiari
  [email protected]

  Elham Kalhor
  [email protected]

¹ Department of Computer Engineering, Sadjad University of Technology, No. 64 Jalal Al Ahmad St, 9188148848 Mashhad, Iran
Multimedia Tools and Applications
On the one hand, emotions cause psychological changes that form in the brain and manifest themselves in human reactions. In addition, emotions increase or decrease the physiological arousal of the body and positively or negatively affect behavior and thoughts. Moreover, the emotion of speech depends on the speaker's language and culture, gender, age, speech content, and other factors [12, 20].

Speech emotion recognition offers numerous applications in human-machine communication systems, for example in education, computer games, medicine, customer communication systems, call centers, and mobile communications [5, 31, 32, 44, 45]. Automatic speech emotion recognition, however, requires acoustic feature extraction. Since there is no information about which features are related to a speaker's emotions, many researchers have proposed a large number of features. Unfortunately, employing all of these may pose two basic challenges.