Noise effect on Amazigh digits in speech recognition system



Ouissam Zealouk¹ · Hassan Satori¹ · Naouar Laaidi¹ · Mohamed Hamidi¹ · Khalid Satori¹

* Hassan Satori
  [email protected]

¹ Laboratory Computer Science, Image Processing and Numerical Analysis, Faculty of Sciences Dhar Mahraz, Sidi Mohammed Ben Abbdallah University, B.P. 1796, Fez, Morocco

Received: 28 January 2020 / Accepted: 21 October 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract

Automatic Speech Recognition (ASR) for Amazigh speech, particularly Moroccan Tarifit accented speech, is a less researched area. This paper focuses on the analysis and evaluation of the first ten Amazigh digits under noisy conditions from an ASR perspective, based on the Signal to Noise Ratio (SNR). Our testing experiments were performed under two types of noise and repeated with added environmental noise at various SNR values ranging from 5 to 45 dB for each kind. Different formalisms are used to develop a speaker-independent Amazigh speech recognition system, such as Hidden Markov Models (HMMs) and Gaussian Mixture Models (GMMs). The experimental results under noisy conditions show that performance degrades for all digits to different degrees, and that the recognition rates under the car noise environment decrease less than under the grinder conditions, with differences of 2.84% and 8.42% at SNRs of 5 dB and 25 dB, respectively. We also observed that the most affected digits are those which contain the "S" alphabet.

Keywords  Automatic speech recognition system · Amazigh language · Hidden Markov Model · Sphinx4 · Noise
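The paper does not describe the tool used to add noise at the SNR levels listed above (5 to 45 dB), so the following is only a minimal sketch of what mixing at a fixed SNR involves: the noise signal is scaled so that the speech-to-noise power ratio matches the target value. The names `mix_at_snr`, `clean_digit`, and `car_noise` are placeholders for illustration, not from the paper.

```python
import numpy as np

def mix_at_snr(speech: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Mix `noise` into `speech` so that the speech-to-noise power ratio
    of the result equals `snr_db` (in decibels)."""
    # Tile or truncate the noise so it covers the whole utterance.
    if len(noise) < len(speech):
        noise = np.tile(noise, int(np.ceil(len(speech) / len(noise))))
    noise = noise[: len(speech)]

    # Average power of each signal.
    p_speech = np.mean(speech.astype(float) ** 2)
    p_noise = np.mean(noise.astype(float) ** 2)

    # Solve 10 * log10(p_speech / (g**2 * p_noise)) = snr_db for the noise gain g.
    gain = np.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    return speech + gain * noise

# Hypothetical usage: corrupt a clean digit recording at the SNRs used in the paper.
# for snr in (5, 15, 25, 35, 45):
#     noisy = mix_at_snr(clean_digit, car_noise, snr)
```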

1 Introduction

Speech recognition is the process of converting a speech signal into a sequence of words by means of algorithms. Recently, it has become a popular input mechanism in several computer applications. The performance of Automatic Speech Recognition systems degrades considerably when speech is corrupted by background noise not seen during training, because the observed speech signal no longer matches the distributions derived from the training material. Many approaches aim at resolving this mismatch, such as normalizing or enhancing the speech features to remove the corrupting noise from the observations prior to recognition (Yu et al. 2008), compensating the acoustic models (Moreno et al. 1996; Gales and Young 1996), and using recognizer architectures that rely only on the least noisy observations (Raj and Stern 2005). Lee et al. (2009) combined speech enhancement with endpoint detection and speech/non-speech discrimination in a commercial application. Kim and Stern (2009) presented a new noise-robust front-end method and compared it under different noise conditions. Model adaptation methods leave the observations unaltered and instead update the recognizer model parameters to make them more representative of the observed speech, e.g. (Li et al. 2007; Hu et al. 2006; Seltzer et al. 2010). These approaches can be further enhanced by training on data from different conditions.
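One common instance of the feature normalization mentioned above is per-utterance cepstral mean and variance normalization; the sketch below shows this technique only as an illustration, not as the method used in the cited works or in this paper. The function name and the assumed (frames, coefficients) feature layout are choices made for the example.

```python
import numpy as np

def cepstral_mean_variance_normalization(feats: np.ndarray) -> np.ndarray:
    """Per-utterance cepstral mean and variance normalization (CMVN).

    `feats` is a (num_frames, num_coefficients) array of MFCC-like features.
    Subtracting the utterance mean removes stationary channel effects, and
    dividing by the standard deviation reduces the scale mismatch between
    clean training data and noisy test data.
    """
    mean = feats.mean(axis=0, keepdims=True)
    std = feats.std(axis=0, keepdims=True) + 1e-10  # avoid division by zero
    return (feats - mean) / std
```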