Moving Sound Source Extraction by Time-Variant Beamforming

Abstract. We have developed a time-variant beamforming method that can extract sound signals from moving sound sources. Moving sound sources are difficult to recognize because their own motion distorts the amplitude and frequency of the observed signal. Our proposed method precisely equalizes these amplitude and frequency distortions so that the source signals can be extracted. Numerical experiments showed that our method improves moving sound source extraction. Extracting such sounds is important for natural human-robot interaction in a real environment, because a robot has to recognize various types of sounds and sound sources.

Keywords: Beamforming, Time-variant system, Moving sound source.

1 Introduction

For successful human-robot interaction in a real environment, a robot should recognize not only human voices but also other sounds in the surrounding environment [1]. If we separate the sound sources in the environment into moving and fixed sound sources, the moving sources are used for predicting temporal changes and the fixed sources are used for recognizing spatial information. We believe that useful human-robot interaction can be achieved by using both the temporal and the spatial information obtained from fixed and moving sources. There has been some research on obtaining temporal and spatial information from fixed sound sources [2]. There has been less research on obtaining such information from moving sound sources, however, because of problems in extracting the original source signal. A moving sound source is difficult to extract because the source’s own movement changes the amplitude and frequency of the source sound. We developed a new beamforming method that can precisely extract source signals by equalizing the changes caused by the source’s movement. Our proposed method differs from conventional sound-extraction methods in that it digitizes the sound source positions and switches beamforming coefficients depending on these digitized source positions. This means there is no discontinuity due to switching beamforming coefficients, and it is not necessary to equalize frequency changes due to the Doppler effect and amplitude changes, because there are theoretically no errors during extraction.

H. Nakajima et al. In: K. Satoh et al. (Eds.): JSAI 2007, LNAI 4914, pp. 47–53, 2008. © Springer-Verlag Berlin Heidelberg 2008

1.1 Time-Variant System and Time-Variant Convolution
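The amplitude and frequency changes that motivate a time-variant model can be made concrete with a small numerical sketch. It is not taken from the paper: it assumes a free-field point source emitting a pure tone while moving on a straight line past a fixed microphone at the origin, with simple 1/r spreading loss. The function name `observe`, its parameters, and the value of the speed of sound are all illustrative assumptions.

```python
import math

C = 343.0  # assumed speed of sound in air [m/s]


def observe(ts, f0=1000.0, speed=20.0, offset=5.0):
    """Return (arrival_time, observed_freq, gain) for sound emitted at time ts.

    Hypothetical setup: the source moves along x = speed * ts at constant
    lateral distance y = offset from a microphone fixed at the origin.
    """
    x = speed * ts
    r = math.hypot(x, offset)        # source-microphone distance at emission
    radial_velocity = speed * x / r  # dr/dts, positive while the source recedes
    arrival = ts + r / C             # propagation delay shifts the time axis
    # Since dt/dts = 1 + (dr/dts)/C, a tone at f0 is observed at f0 * dts/dt.
    f_obs = f0 / (1.0 + radial_velocity / C)
    gain = 1.0 / r                   # assumed 1/r spherical spreading only
    return arrival, f_obs, gain


if __name__ == "__main__":
    for ts in (-1.0, 0.0, 1.0):
        t, f, g = observe(ts)
        print(f"emitted {ts:+.1f} s, heard {t:+.3f} s, {f:7.1f} Hz, gain {g:.3f}")
```

The observed pitch is above `f0` while the source approaches and below it while the source recedes, and the gain peaks at the closest point, so both quantities vary continuously with the source position.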

We assume that there is a moving sound source whose position and signal waveform are $p(t)$ and $s(t)$, respectively. The observed signal at position $q(t)$ can be derived from the wave equation for moving sources [3] as

$$x(t) = \int s(t_s)\, h(t - t_s, p(t_s))\, dt_s, \qquad (1)$$

where $h(t, p(t_s))$ is the impulse response from the source at position $p(t_s)$ to the observation point. This equation implies that the output signal $x(t)$ can be calculated if the source signal $s(t)$ and impulse responses h(t, p(ts