Using Intermicrophone Correlation to Detect Speech in Spatially Separated Noise
Ashish Koul1 and Julie E. Greenberg2

1 Broadband Video Compression Group, Broadcom Corporation, Andover, MA 01810, USA
2 Massachusetts Institute of Technology, 77 Massachusetts Avenue, Room E25-518, Cambridge, MA 02139-4307, USA
Received 29 April 2004; Revised 20 April 2005; Accepted 25 April 2005

This paper describes a system for determining intervals of “high” and “low” signal-to-noise ratios when the desired signal and interfering noise arise from distinct spatial regions. The correlation coefficient between two microphone signals serves as the decision variable in a hypothesis test. The system has three parameters: the center frequency and bandwidth of the bandpass filter that prefilters the microphone signals, and the threshold for the decision variable. Conditional probability density functions of the intermicrophone correlation coefficient are derived for a simple signal scenario. This theoretical analysis provides insight into optimal selection of system parameters. Results of simulations using white Gaussian noise sources are in close agreement with the theoretical results. Results of more realistic simulations using speech sources follow the same general trends and illustrate the performance achievable in practical situations. The system is suitable for use with two microphones in mild-to-moderate reverberation as a component of noise-reduction algorithms that require detecting intervals when a desired signal is weak or absent.

Copyright © 2006 A. Koul and J. E. Greenberg. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
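To make the decision procedure concrete, the following minimal Python sketch illustrates the kind of processing the abstract describes: both microphone signals are bandpass prefiltered, the intermicrophone correlation coefficient is computed over short frames, and each frame is labeled “high” or “low” SNR by comparing that coefficient to a threshold. The Butterworth filter, sampling rate, frame length, and parameter values shown here are illustrative assumptions, not the specific choices analyzed in the paper.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def detect_snr_frames(mic1, mic2, fs=16000, center_hz=1000.0,
                      bandwidth_hz=500.0, threshold=0.5, frame_len=256):
    """Label frames 'high' or 'low' SNR using the intermicrophone
    correlation coefficient as the decision variable.

    The three system parameters are the bandpass center frequency,
    the bandpass bandwidth, and the decision threshold.
    (Illustrative sketch; parameter values are assumptions.)
    """
    # Bandpass-prefilter both microphone signals (4th-order Butterworth here).
    low = max(center_hz - bandwidth_hz / 2.0, 1.0)
    high = min(center_hz + bandwidth_hz / 2.0, fs / 2.0 - 1.0)
    sos = butter(4, [low, high], btype="bandpass", fs=fs, output="sos")
    x1 = sosfiltfilt(sos, mic1)
    x2 = sosfiltfilt(sos, mic2)

    decisions = []
    for start in range(0, len(x1) - frame_len + 1, frame_len):
        f1 = x1[start:start + frame_len]
        f2 = x2[start:start + frame_len]
        # Intermicrophone correlation coefficient for this frame.
        rho = np.corrcoef(f1, f2)[0, 1]
        # High correlation suggests the desired source dominates ("high" SNR);
        # low correlation suggests spatially separated noise dominates ("low" SNR).
        decisions.append("high" if rho >= threshold else "low")
    return decisions
```

In a noise-reduction front end of the kind discussed below, the frames labeled “low” would be the intervals during which noise statistics are updated.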
1. INTRODUCTION
Conventional hearing aids do not selectively attenuate background noise, and their inability to do so is a common complaint of hearing-aid users [1–4]. Researchers have proposed a variety of speech-enhancement and noise-reduction algorithms to address this problem. Many of these algorithms require identification of intervals when the desired speech signal is weak or absent, so that particular noise characteristics can be estimated accurately [5–7]. Systems that perform this function are referred to by a number of terms, including voice activity detectors, speech detectors, pause detectors, and double-talk detectors.

Speech pause detectors are not limited to use in hearing-aid algorithms. They are used in a number of applications including speech recognition [8, 9], mobile telecommunications [10, 11], echo cancellation [12], and speech coding [13].

In some cases, noise-reduction algorithms are initially developed and evaluated using information about the timing of speech pauses derived from the clean signal, which is possible in computer simulations but not in a practical device. Marzinzik and Kollmeier [11] point out that speech pause detectors “are a very sensitive and often limiting part of systems for the reduction of additive noise in