Multichannel Direction-Independent Speech Enhancement Using Spectral Amplitude Estimation
Thomas Lotter
Institute of Communication Systems and Data Processing, Aachen University (RWTH), Templergraben 55, D-52056 Aachen, Germany
Email: [email protected]

Christian Benien
Philips Research Center, Aachen, Weißhausstraße 2, D-52066 Aachen, Germany
Email: [email protected]

Peter Vary
Institute of Communication Systems and Data Processing, Aachen University (RWTH), Templergraben 55, D-52056 Aachen, Germany
Email: [email protected]

Received 25 November 2002 and in revised form 12 March 2003

This paper introduces two short-time spectral amplitude estimators for speech enhancement with multiple microphones. Based on joint Gaussian models of speech and noise Fourier coefficients, the clean speech amplitudes are estimated with respect to the MMSE or the MAP criterion. The estimators outperform single-microphone minimum mean square amplitude estimators when the speech components are highly correlated and the noise components are sufficiently uncorrelated. Whereas the first (MMSE) estimator also requires knowledge of the direction of arrival, the second (MAP) estimator performs a direction-independent noise reduction. The estimators are generalizations of the well-known single-channel MMSE estimator derived by Ephraim and Malah (1984) and the MAP estimator derived by Wolfe and Godsill (2001), respectively.

Keywords and phrases: speech enhancement, microphone arrays, spectral amplitude estimation.
1. INTRODUCTION
Speech communication appliances such as voice-controlled devices, hearing aids, and hands-free telephones often suffer from poor speech quality due to background noise and room reverberation. Multiple-microphone techniques such as beamformers can improve speech quality and intelligibility by exploiting the spatial diversity of speech and noise sources. These techniques can be divided into fixed and adaptive beamformers. A fixed beamformer combines the noisy signals by a time-invariant filter-and-sum operation. The filters can be designed to achieve constructive superposition towards a desired direction (delay-and-sum beamformer) or to maximize the SNR improvement (superdirective beamformer) [1, 2, 3]. Adaptive beamformers commonly consist of a fixed beamformer steered towards a fixed desired direction and an adaptive null steering towards moving interfering sources [4, 5].
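For illustration, the following is a minimal sketch of a frequency-domain delay-and-sum beamformer for a uniform linear array. It is not taken from the paper; the function name, array geometry, steering convention, and uniform channel weights are assumptions made only for this example.

    import numpy as np

    def delay_and_sum(stft, mic_positions, theta, fs, n_fft, c=343.0):
        """Steer a uniform linear array towards angle theta (radians from broadside).

        stft:          complex STFT, shape (M, n_frames, n_bins)
        mic_positions: microphone x-coordinates in meters, shape (M,)
        """
        n_bins = stft.shape[2]
        freqs = np.arange(n_bins) * fs / n_fft            # bin center frequencies
        delays = mic_positions * np.sin(theta) / c        # per-microphone propagation delays
        # Phase factors that time-align all channels towards the assumed direction
        steering = np.exp(2j * np.pi * np.outer(delays, freqs))   # shape (M, n_bins)
        aligned = stft * steering[:, None, :]
        return aligned.mean(axis=0)                       # uniform weights: delay and sum

A superdirective beamformer would replace the uniform averaging weights by frequency-dependent weights derived from the noise coherence matrix; the steering step stays the same.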
All beamformer techniques assume the target direction of arrival (DOA) to be known a priori or assume that it can be estimated with sufficient accuracy. The performance of such a beamforming system usually degrades dramatically if the DOA knowledge is erroneous. To estimate the DOA at runtime, time difference of arrival (TDOA)-based locators evaluate the maximum of a weighted cross correlation [6, 7]. Subspace methods can detect multiple sources by decomposing the spatial covariance matrix into a signal subspace and a noise subspace. However, the performance of all DOA estimation algorithms suffers severely from room reverberation.
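As a concrete instance of such a weighted cross correlation, the sketch below estimates the TDOA between two microphone signals with the generalized cross correlation with phase transform (GCC-PHAT). This is a standard method, not necessarily the one used in [6, 7], and the function and parameter names are illustrative only.

    import numpy as np

    def gcc_phat_tdoa(x1, x2, fs, max_tau=None):
        """Estimate the TDOA (in seconds) between two microphone signals."""
        n = len(x1) + len(x2)
        X1 = np.fft.rfft(x1, n)
        X2 = np.fft.rfft(x2, n)
        cross = X1 * np.conj(X2)
        cross /= np.abs(cross) + 1e-12                    # PHAT weighting: keep phase only
        cc = np.fft.irfft(cross, n)                       # generalized cross correlation
        max_shift = n // 2 if max_tau is None else min(int(fs * max_tau), n // 2)
        cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))   # center lag 0
        lag = np.argmax(np.abs(cc)) - max_shift           # lag of the correlation maximum
        return lag / fs

The estimated TDOA can then be converted to a DOA estimate via the known microphone spacing, which is exactly the step whose reliability degrades in reverberant rooms.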