On Building Immersive Audio Applications Using Robust Adaptive Beamforming and Joint Audio-Video Source Localization


J. A. Beracoechea, S. Torres-Guijarro, L. García, and F. J. Casajús-Quirós

Departamento de Señales, Sistemas y Radiocomunicaciones, Universidad Politécnica de Madrid, 28040 Madrid, Spain

Received 20 December 2005; Revised 26 April 2006; Accepted 11 June 2006

This paper deals with some of the different problems, strategies, and solutions of building true immersive audio systems oriented to future communication applications. The aim is to build a system where the acoustic field of a chamber is recorded using a microphone array and then is reconstructed, or rendered again, in a different chamber using loudspeaker array-based techniques. Our proposal explores the possibility of using recent robust adaptive beamforming techniques for effectively estimating the original sources of the emitting room. A joint audio-video localization method needed in the estimation process as well as in the rendering engine is also presented. The estimated source signal and the source localization information drive a wave field synthesis engine that renders the acoustic field again at the receiving chamber. The system performance is tested using MUSHRA-based subjective tests.

Copyright © 2006 J. A. Beracoechea et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. INTRODUCTION

The history of spatial audio started almost 70 years ago. In a patent filed in 1931, Blumlein [1] described the basics of stereo recording and reproduction, which can be considered the first true spatial audio system. At that time, the possibility of creating “phantom sources” represented a major breakthrough over monaural systems. Some years later, it was determined that adding more than two channels did not improve the results enough to justify the additional technical and economic effort [2]. Besides, at that time it was very difficult and expensive to record many channels simultaneously, so stereophony became, and remains to this day, the most widely used sound reproduction system in the world. In the 1970s, some efforts tried to enhance the spatial quality by adding 2 more channels (quadraphony), but the results were so poor that the system was abandoned. Lately, we have seen the development of a number of sound reproduction systems that use even more channels to further increase the spatial sound quality. Originally designed for cinemas, the five-channel stereo (or 5.1) adds 2 surround channels and a center channel to enhance the spatial perception of the listeners. Although well received by the industry and the general public, results with these systems range from excellent to poor depending on the recorded material and the way it is reproduced. In general, all stereo-based systems suffer from the same problems. First of all, the position of the loudspeakers is very