Performance analysis of neural network, NMF and statistical approaches for speech enhancement

Ravi Kumar Kandagatla1 · Venkata Subbaiah Potluri2
Received: 3 May 2019 / Accepted: 2 September 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract Bayesian estimators are very useful in speech enhancement and noise reduction. However, traditional estimators process only the amplitudes and leave the phase unprocessed. Among the Bayesian estimators, super-Gaussian based estimators provide improved noise reduction, and super-Gaussian Bayesian estimators that use processed phase information for amplitude estimation improve the results further. In this work, Complex speech coefficients given Uncertain Phase (CUP) based Bayesian estimators such as CUP-GG (CUP estimator with speech spectral coefficients assumed as Gamma and noise spectral coefficients as Generalized Gamma) and CUP-NG (speech as Nakagami) are compared under white noise, pink noise, babble noise and non-stationary factory noise conditions. The statistical estimators are less effective under completely non-stationary conditions such as non-stationary factory noise and babble noise. Non-negative Matrix Factorization (NMF) based algorithms perform better for non-stationary noises. The drawback of NMF is that it requires a priori knowledge about speech. This drawback can be overcome by combining the advantages of the statistical and NMF approaches. The NR-NMF and WR-NMF speech enhancement methods are developed by providing posteriori regularization based on statistical assumptions about the distribution of speech and noise DFT coefficients. A speech enhancement method that uses the CUP-GG estimator and NMF with online noise bases update is also considered for comparison. Progress in neural network based approaches has further shown that, with a large dataset and better training, speech enhancement algorithms yield improved results. In this work, a neural network approach for speech enhancement is implemented and compared with traditional estimators and NMF approaches. To generalize to unseen noise types, the proposed neural network approach uses dropout. Features obtained from the a priori SNR and a posteriori SNR are also used to train the network. The objective of this paper is to analyze the performance of speech enhancement methods based on neural networks, NMF and statistical approaches. The objective performance measures Perceptual Evaluation of Speech Quality (PESQ), Short-Time Objective Intelligibility (STOI), Signal to Noise Ratio (SNR) and Segmental SNR (Seg SNR) are considered for comparison.

Keywords Non-negative matrix factorization (NMF) · CUP estimator · Noise reduction · Probability density function · Dropout · Apriori SNR and aposteriori SNR · PESQ
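The a priori and a posteriori SNR features mentioned in the abstract can be illustrated with a minimal sketch. It assumes the per-bin noise power has already been estimated and uses the simple maximum-likelihood relation between the two quantities (a priori SNR = max(a posteriori SNR − 1, 0)). The estimator and feature pipeline actually used in the paper are not given in this excerpt, and the function name `snr_features` is hypothetical.

```python
import numpy as np

def snr_features(noisy_power, noise_power, floor=1e-10):
    """Per-bin a posteriori and a priori SNR from power spectra.

    noisy_power: |Y(k, l)|^2 of the noisy speech, shape (bins, frames)
    noise_power: noise power estimate, broadcastable to noisy_power
    floor: small constant (an assumption) that avoids division by zero
    """
    noisy_power = np.asarray(noisy_power, dtype=float)
    noise_power = np.maximum(np.asarray(noise_power, dtype=float), floor)
    # A posteriori SNR: noisy power relative to the noise power.
    gamma = noisy_power / noise_power
    # Maximum-likelihood a priori SNR estimate, clipped at zero.
    xi = np.maximum(gamma - 1.0, 0.0)
    return xi, gamma

# Toy example: 2 frequency bins, 2 frames, unit noise power in each bin.
noisy = np.array([[4.0, 0.5], [9.0, 1.0]])
noise = np.array([[1.0], [1.0]])
xi, gamma = snr_features(noisy, noise)
```

In practice the two arrays would typically be log-compressed and stacked per frame before being fed to a network; that step is left out here because the excerpt does not describe it.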

1 Introduction

* Ravi Kumar Kandagatla

1 Department of Electronics and Communication Engineering, Lakireddy Bali Reddy College of Engineering (Autonomous), Mylavaram, Kr
