Sector-Based Detection for Hands-Free Speech Enhancement in Cars


Guillaume Lathoud,1,2 Julien Bourgeois,3 and Jürgen Freudenberger3

1 IDIAP Research Institute, 1920 Martigny, Switzerland
2 École Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland
3 DaimlerChrysler Research and Technology, 89014 Ulm, Germany

Received 31 January 2005; Revised 20 July 2005; Accepted 22 August 2005

Adaptation control of beamforming interference cancellation techniques is investigated for in-car speech acquisition. Two efficient adaptation control methods are proposed that avoid target cancellation. The “implicit” method varies the step-size continuously, based on the filtered output signal. The “explicit” method decides in a binary manner whether to adapt or not, based on a novel estimate of target and interference energies. It estimates the average delay-sum power within a volume of space, for the same cost as the classical delay-sum. Experiments on real in-car data validate both methods, including a case with 100 km/h background road noise.

Copyright © 2006 Hindawi Publishing Corporation. All rights reserved.

1. INTRODUCTION

Speech-based command interfaces are becoming increasingly common in cars, for example in automatic dialog systems for hands-free phone calls and navigation assistance. Automatic speech recognition performance is crucial, and it can be greatly hampered by interference such as speech from a codriver. Unfortunately, spontaneous multiparty speech contains many overlaps between participants [1].

A directional microphone oriented towards the driver provides an immediate hardware enhancement by lowering the energy level of the codriver interference. In the Mercedes S320 setup used in this article, a 6 dB relative difference is achieved (value measured in the car). However, an additional software improvement is required to fully cancel the codriver’s interference, for example, with adaptive techniques. These consist of a time-varying linear filter that enhances the signal-to-interference ratio (SIR), as depicted in Figure 1.

Many beamforming algorithms have been proposed, with various degrees of relevance in the car environment [2]. Apart from differential array designs, superdirective beamformers [3] derived from the minimum variance distortionless response (MVDR) principle, such as the generalized sidelobe canceller (GSC) structure, apply well to our hardware setup. The original adaptive versions assume a fixed, known acoustic propagation channel. This is rarely the case in practice, so the target signal is reduced at the beamformer output. A solution is to adapt only when the interferer is dominant, by varying the adaptation speed either in a binary manner (explicit control) or in a continuous manner (implicit control).

Existing explicit methods detect when the target is dominant by thresholding an estimate of the input SIR, $\widehat{\mathrm{SIR}}_{\mathrm{in}}(t)$, or a related quantity. During those periods, adaptation is stopped [4] or the acoustic channel is tracked [5, 6] (and related self-calibration algo