Subspace Methods for Multimicrophone Speech Dereverberation

Sharon Gannot
School of Engineering, Bar-Ilan University, Ramat-Gan 52900, Israel
Email: [email protected]

Marc Moonen
Department of Electrical Engineering, Katholieke Universiteit Leuven, ESAT-SISTA, Kasteelpark Arenberg 10, B-3001 Heverlee, Belgium
Email: [email protected]

Received 2 September 2002 and in revised form 14 March 2003

A novel approach to multimicrophone speech dereverberation is presented. The method is based on the construction of the null subspace of the data matrix in the presence of colored noise, using either the generalized singular-value decomposition (GSVD) technique or the generalized eigenvalue decomposition (GEVD) of the respective correlation matrices. The special Sylvester structure of the filtering matrix related to this subspace is exploited to derive a total least squares (TLS) estimate of the acoustical transfer functions (ATFs). Other, less robust but computationally more efficient, methods are derived from the same structure and from the QR decomposition (QRD). A preliminary study of incorporating the subspace method into a subband framework proves efficient, although some problems remain open. Speech reconstruction is achieved by means of the matched filter beamformer (MFBF). An experimental study supports the potential of the proposed methods.

Keywords and phrases: speech dereverberation, subspace methods, subband processing.
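To make the null-subspace idea concrete, the following sketch illustrates its simplest instance: noiseless, two-channel blind channel identification via the cross-relation x1 * h2 = x2 * h1, where the stacked vector [h2; h1] spans the null space of a structured data matrix and is recovered (up to a scale factor) as the right singular vector of the smallest singular value. This is not the paper's GSVD/GEVD algorithm for colored noise; it is a minimal toy example of the underlying principle, and all signal lengths and variable names here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two short "acoustic" impulse responses of length L (hypothetical values)
L = 4
h1 = rng.standard_normal(L)
h2 = rng.standard_normal(L)

# Source signal and noiseless microphone signals
N = 200
s = rng.standard_normal(N)
x1 = np.convolve(s, h1)  # microphone 1, length N + L - 1
x2 = np.convolve(s, h2)  # microphone 2

# Cross-relation: (x1 * h2)[n] = (x2 * h1)[n] for all n, since both
# equal (s * h1 * h2)[n]. In matrix form: X1 @ h2 - X2 @ h1 = 0.
def conv_matrix(x, L):
    """Rows are length-L reversed sliding windows of x, so that
    conv_matrix(x, L) @ h gives the 'valid' part of x * h."""
    M = len(x) - L + 1
    return np.array([x[i:i + L][::-1] for i in range(M)])

X1 = conv_matrix(x1, L)
X2 = conv_matrix(x2, L)
D = np.hstack([X1, -X2])  # D @ [h2; h1] = 0

# The null vector of D is found via the SVD: it is the right singular
# vector associated with the smallest singular value.
_, _, Vt = np.linalg.svd(D, full_matrices=False)
v = Vt[-1]
h2_est, h1_est = v[:L], v[L:]

# The estimates match the true responses up to one common scale factor
scale = (h1_est @ h1) / (h1_est @ h1_est)
print(np.allclose(scale * h1_est, h1, atol=1e-8))  # True
print(np.allclose(scale * h2_est, h2, atol=1e-8))  # True
```

In the noiseless case an ordinary SVD suffices; the paper's contribution lies in replacing it with a GSVD/GEVD to whiten colored noise, and in exploiting the Sylvester structure of the filtering matrix for more than two microphones.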

1. INTRODUCTION

In many speech communication applications, the recorded speech signal is subject to reflections off the room walls and other objects on its way from the source to the microphones. The resulting signal is called reverberated speech. Its quality may deteriorate severely, and even intelligibility can degrade. Subsequent processing of the speech signal, such as speech coding or automatic speech recognition, may be rendered useless in the presence of reverberation. Although single-microphone dereverberation techniques do exist, the most successful dereverberation methods are based on multimicrophone measurements.

Spatiotemporal methods, which are applied directly to the received signals, have been presented by Liu et al. [1] and by Gonzalez-Rodriguez et al. [2]. They consist of spatial averaging of the minimum-phase component of the speech signal and cepstrum-domain processing to manipulate its all-pass component. Other methods use the linear prediction residual signal to dereverberate the speech signal [3, 4]. Beamforming methods [5, 6] that use an estimate of the related acoustical transfer functions (ATFs) can reduce the amount of reverberation, especially if some a priori knowledge of the acoustical transfer is available. The averaged ATFs of all the microphones prove to be efficient and quite robust to small speaker movements. However, if this information is not available, these methods cannot eliminate the reverberation completely. Hence, we will avoid using the small