Underdetermined Blind Source Separation in Echoic Environments Using DESPRIT

Research Article

Thomas Melia and Scott Rickard

Sparse Signal Processing Group, University College Dublin, Belfield, Dublin 4, Ireland

Received 1 October 2005; Revised 4 April 2006; Accepted 27 May 2006

Recommended by Andrzej Cichocki

The DUET blind source separation algorithm can demix an arbitrary number of speech signals using M = 2 anechoic mixtures of the signals. DUET, however, is limited in that it relies on source signals that are mixed in an anechoic environment and that are sufficiently sparse that only one source can be assumed active at any given time-frequency point. The DUET-ESPRIT (DESPRIT) blind source separation algorithm extends DUET to M ≥ 2 sparsely echoic mixtures of an arbitrary number of sources that overlap in time-frequency. This paper outlines the development of the DESPRIT method and demonstrates its properties through various experiments conducted on synthetic and real-world mixtures.

Copyright © 2007 T. Melia and S. Rickard. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. INTRODUCTION

The “cocktail party phenomenon” illustrates the ability of the human auditory system to separate out a single speech source from the cacophony of a crowded room using only two sensors, with no prior knowledge of the speakers or of the channel presented by the room. Efforts to implement a receiver that emulates this sophistication are referred to as blind source separation techniques [1–3]. The DUET blind source separation method [4] can demix an arbitrary number of speech source signals given just two anechoic mixtures of the sources, provided that the time-frequency representations of the sources do not overlap. The technique is limited in the following respects. (1) It is not obvious how best to extend the technique to a situation where more mixtures are available. (2) The assumption that only one source is active at a given time-frequency point is limiting, especially when M > 2 mixtures may be available. (3) The anechoic mixing model clearly restricts the types of environments where DUET can be applied.

A number of extensions to the DUET blind source separation method have recently been proposed [5–7] that address these issues. In this paper we summarise and characterise the performance of these extensions, which we believe embody the natural multichannel, echoic extension of DUET. Other authors have proposed different DUET extensions; for example, [8–11] describe multichannel extensions to DUET when M ≥ 2 mixtures are available. It is recognised in [9–15] that the assumption that only one source is active at a given time-frequency point is quite a harsh restriction to place upon large numbers of speech sources, and weakened forms of this assumption are presented in these papers. An echoic extension to DUET is demonstrated in [9] when the m