Minimum mean square error estimator for speech enhancement in additive noise assuming Weibull speech priors and speech presence uncertainty

Mojtaba Bahrami¹ · Neda Faraji¹

Received: 5 May 2019 / Accepted: 29 October 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract A novel single-channel technique based on a minimum mean square error (MMSE) estimator is proposed to enhance the short-time spectral amplitude (STSA) in the discrete Fourier transform (DFT) domain. A Weibull distribution is used to model the DFT magnitudes of clean speech signals under the assumption of additive Gaussian noise. The speech enhancement procedure is carried out both with speech presence uncertainty (WSPU) and without it (WoSPU). The theoretical spectral gain function is obtained as a weighted geometric mean of the hypothetical gains associated with signal presence and absence. Extensive experiments were conducted on clean speech signals taken from the TIMIT database that had been degraded by various additive non-stationary noise sources, and the enhanced signals were then evaluated. The evaluation results show that the proposed method outperforms the use of Rayleigh and Gamma probability density functions (PDFs) as speech priors in terms of segmental signal-to-noise ratio (segSNR), general SNR, and perceptual evaluation of speech quality (PESQ). With Weibull speech priors in the MMSE-STSA based speech enhancement algorithm, performance in the WSPU case is also significantly better than in the WoSPU case.

Keywords  Speech enhancement · Weibull distribution · Speech presence uncertainty · Minimum mean square error estimation
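As an illustration of the gain combination mentioned in the abstract, the following minimal Python sketch computes a weighted geometric mean of two hypothetical gains. The names `gain_presence`, `gain_absence`, and `presence_prob` are assumptions introduced here for illustration; the abstract does not give the expressions for the individual gains or for the speech presence probability, so they are treated as plain inputs.

```python
import math


def combined_gain(gain_presence, gain_absence, presence_prob):
    """Weighted geometric mean of two spectral gains.

    gain_presence: hypothetical gain assuming speech is present (> 0)
    gain_absence:  hypothetical gain assuming speech is absent (> 0)
    presence_prob: weight given to the presence hypothesis, in [0, 1]

    All three names are illustrative assumptions, not the paper's notation.
    """
    if not 0.0 <= presence_prob <= 1.0:
        raise ValueError("presence_prob must lie in [0, 1]")
    if gain_presence <= 0.0 or gain_absence <= 0.0:
        raise ValueError("gains must be strictly positive")
    # Work in the log domain: G = G_presence**p * G_absence**(1 - p)
    log_gain = (presence_prob * math.log(gain_presence)
                + (1.0 - presence_prob) * math.log(gain_absence))
    return math.exp(log_gain)


if __name__ == "__main__":
    # One time-frequency bin: strong evidence of speech keeps the gain
    # close to gain_presence; weak evidence pulls it toward gain_absence.
    for p in (0.0, 0.5, 1.0):
        print(p, combined_gain(0.8, 0.1, p))
```

Because the mean is geometric, the combined gain interpolates between the two hypothetical gains on a logarithmic (dB) scale, so a small gain under the absence hypothesis attenuates the bin strongly unless the presence weight is close to one.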

1 Introduction

Speech enhancement plays an important role in voice communication systems and in signal processing more broadly. Before a noisy speech signal is delivered to the listener, it should be enhanced so that audio communication in noisy environments reaches an acceptable quality. The main purpose of speech enhancement algorithms is to recover the original clean speech signal from a noisy observation in order to improve perceived quality and/or intelligibility, reduce listener fatigue, and improve the performance of automatic speech recognition.

* Neda Faraji [email protected]
Mojtaba Bahrami [email protected]
¹ Department of Electrical Engineering, Imam Khomeini International University, Qazvin, Iran

Over recent decades, speech enhancement methods have been developed along several lines, including spectral subtraction (SS) (Loizou 2013; Paliwal et al. 2010), subspace algorithms (Wei et al. 2013; Tong et al. 2015), MMSE estimation (Erkelens et al. 2007; McCallum and Guillemin 2013; Bahrami and Faraji 2017; Kumar 2018) and Wiener filtering (ElFattah et al. 2014; Modhave et al. 2016). Speech enhancement methods can also be categorized into single-channel approaches (Lotter and Vary 2005; Bahrami and Seyedin 2018) and multi-channel ones (Souden et al. 2013; Kayser and Anemueller