Pattern recognition and features selection for speech emotion recognition model using deep learning


Kittisak Jermsittiparsert¹ · Abdurrahman Abdurrahman² · Parinya Siriattakul³ · Ludmila A. Sundeeva⁴ · Wahidah Hashim⁵ · Robbi Rahim⁶ · Andino Maseleno⁷

Received: 12 November 2019 / Accepted: 17 February 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract

Automatic speaker recognition builds on models of speaker characterization, pattern analysis, and engineering. This paper focuses on the effect of classification and feature selection methods on speech emotion recognition. Selecting the right parameters in combination with the classifier is an important part of reducing the computational complexity of the system. This becomes essential particularly for models that are deployed in real-time scenarios. In this paper, a new deep learning-based speech recognition model is presented for automatically recognizing spoken words. The quality of the input source, i.e., the speech signal, has a direct impact on the accuracy the classifier can attain. The Berlin database consists of around 500 recordings of media persons, both male and female. On this dataset, the presented model achieves maximum accuracies of 94.21%, 83.54%, 83.65%, and 78.13% with MFCC, prosodic, LSP, and LPC features, respectively. The presented model offers better recognition performance than the other methods.

Keywords Deep learning · Speech · Emotion recognition · Feature extraction

1 Introduction

Advances in applications and services make it attractive to support natural communication between humans and machines. Issuing commands through voice and gestures has become common in recent years. A large amount of data can be obtained from human speech with good accuracy. Human speech also carries additional information about the speaker, such as age, gender, emotional condition, speech impairment, and other characteristics of the human voice. Which input features are efficient is determined by comparing the speech features against one another and identifying those that perform best. As the title indicates, this work concerns models for classifying emotion from human speech. Emotion is a crucial human attribute that reflects a person's mental state and has physiological effects, whereas the

* Andino Maseleno

1 Ton Duc Thang University, Ho Chi Minh City, Vietnam
2 Physics Education Department, Lampung University, Tanjungkarang, Indonesia
3 School of Psychology, University of Queensland, Brisbane, Australia
4 Togliatti State University, Tolyatti, Russia
5 Institute of Informatics and Computing Energy, Universiti Tenaga Nasional, Kajang, Malaysia
6 Sekolah Tinggi Ilmu Man
