Time-Varying Noise Estimation for Speech Enhancement and Recognition Using Sequential Monte Carlo Method



Kaisheng Yao
Institute for Neural Computation, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0523, USA
Email: [email protected]

Te-Won Lee
Institute for Neural Computation, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0523, USA
Email: [email protected]

Received 4 May 2003; Revised 9 April 2004

We present a method for sequentially estimating time-varying noise parameters. Noise parameters are sequences of time-varying mean vectors representing the noise power in the log-spectral domain. The proposed sequential Monte Carlo method generates a set of particles in compliance with the prior distribution given by clean speech models. The noise parameters in this model evolve according to random walk functions, and the model uses extended Kalman filters to update the weight of each particle as a function of observed noisy speech signals, speech model parameters, and the evolved noise parameters in each particle. Finally, the updated noise parameter is obtained by means of minimum mean square error (MMSE) estimation on these particles. For efficient computations, residual resampling and Metropolis-Hastings smoothing are used. The proposed sequential estimation method is applied to noisy speech recognition and speech enhancement under strongly time-varying noise conditions. In both scenarios, this method outperforms some alternative methods.

Keywords and phrases: sequential Monte Carlo method, speech enhancement, speech recognition, Kalman filter, robust speech recognition.
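The estimation loop summarized in the abstract (particles evolving by a random walk, likelihood-based weight updates, an MMSE readout, and residual resampling) can be sketched as a generic particle filter. This is an illustrative sketch under simplifying assumptions, not the authors' implementation: the noise parameter is a scalar log-spectral mean, and a plain Gaussian observation likelihood stands in for the paper's extended Kalman filter weight update; `track_noise_mean` and its parameters are names chosen here for illustration.

```python
import numpy as np

def residual_resample(weights, rng):
    """Residual resampling: keep floor(N * w_i) copies of particle i
    deterministically, then draw the remaining slots multinomially
    from the residual weights."""
    n = len(weights)
    counts = np.floor(n * weights).astype(int)
    n_rest = n - counts.sum()
    if n_rest > 0:
        residual = n * weights - counts
        residual /= residual.sum()
        counts += rng.multinomial(n_rest, residual)
    return np.repeat(np.arange(n), counts)

def track_noise_mean(observations, n_particles=500, walk_std=0.1,
                     obs_std=0.5, rng=None):
    """Track a slowly time-varying noise mean in the log-spectral domain.

    Each particle carries one noise-mean hypothesis. Per frame:
    (1) evolve particles by a random walk, (2) reweight by a Gaussian
    observation likelihood (stand-in for the EKF update in the paper),
    (3) record the MMSE estimate (weighted particle mean),
    (4) residual-resample."""
    if rng is None:
        rng = np.random.default_rng(0)
    particles = rng.normal(0.0, 1.0, n_particles)  # draw from a simple prior
    estimates = []
    for y in observations:
        particles = particles + rng.normal(0.0, walk_std, n_particles)
        log_w = -0.5 * ((y - particles) / obs_std) ** 2
        w = np.exp(log_w - log_w.max())   # normalize in log space for stability
        w /= w.sum()
        estimates.append(float(np.sum(w * particles)))  # MMSE estimate
        particles = particles[residual_resample(w, rng)]
    return estimates
```

On a synthetic ramp (noise mean drifting from 0 to 2 over 200 frames, observed with noise), the MMSE estimates follow the drift; the paper's method replaces the scalar Gaussian likelihood with per-particle extended Kalman filters driven by clean speech models.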

1. INTRODUCTION

A speech processing system may be required to work in conditions where the speech signals are distorted by background noise. Such distortions can drastically degrade the performance of automatic speech recognition (ASR) systems, which usually perform well in quiet environments. Similarly, speech-coding systems spend much of their coding capacity encoding additional noise information. There has been great interest in developing algorithms that are robust to these distortions. In general, the proposed methods can be grouped into two approaches. The first is based on front-end processing of the speech signal, for example, speech enhancement. Speech enhancement can be performed either in the time domain, for example, in [1, 2], or, more commonly, in the spectral domain [3, 4, 5, 6, 7]. The objective of speech enhancement is to increase the signal-to-noise ratio (SNR) of the processed speech relative to the observed noisy speech signal.
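To make the SNR objective concrete, the standard definition is the ratio of signal power to noise power, expressed in decibels. The following helper is our illustration, not part of the paper; `snr_db` is a name chosen here.

```python
import numpy as np

def snr_db(clean, noisy):
    """Signal-to-noise ratio in dB: 10 * log10(signal power / noise power),
    where the noise is taken as the difference between the noisy and
    clean signals."""
    clean = np.asarray(clean, dtype=float)
    noise = np.asarray(noisy, dtype=float) - clean
    return 10.0 * np.log10(np.sum(clean ** 2) / np.sum(noise ** 2))
```

For example, a signal corrupted by additive noise at one-tenth its amplitude has an SNR of 20 dB; an enhancement front end succeeds if the SNR of its output exceeds that of its input.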

The second approach is based on statistical models of speech and/or noise. For example, parallel model combination (PMC) [8] adapts speech mean vectors according to the input noise power. In [9], code-dependent cepstral normalization (CDCN) modifies speech signals based on probabilities from speech models. Since methods in this model-based approach are devised in a principled way, for example, via maximum likelihood estimation [9], they usually have better p
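The mean-vector adaptation that PMC performs is commonly described by the "log-add" approximation: speech and noise powers add in the linear domain, so in the log-spectral domain the combined mean is the log of the sum of exponentials of the two means. The sketch below shows only this mean combination under that approximation (variance terms and the cepstral transform that full PMC uses are omitted); the function name is ours.

```python
import numpy as np

def combine_log_spectral_means(mu_speech, mu_noise):
    """PMC-style mean combination in the log-spectral domain.

    Powers add linearly: exp(mu_combined) = exp(mu_speech) + exp(mu_noise),
    so mu_combined = log(exp(mu_speech) + exp(mu_noise)).
    Computed with the max subtracted first for numerical stability."""
    mu_speech = np.asarray(mu_speech, dtype=float)
    mu_noise = np.asarray(mu_noise, dtype=float)
    m = np.maximum(mu_speech, mu_noise)
    return m + np.log(np.exp(mu_speech - m) + np.exp(mu_noise - m))
```

When the noise power is much smaller than the speech power, the combined mean stays close to the speech mean; as noise power grows, the combined mean is pulled toward the noise mean, which is the behavior the adapted recognizer models exploit.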