Optimization of sound fields reproduction based Higher-Order Ambisonics (HOA) using the Generative Adversarial Network (GAN)

Lingkun Zhang1,2 · Xiaochen Wang1,2 · Ruimin Hu1,2 · Dengshi Li1,3 · Weipin Tu1,3

Received: 17 January 2020 / Revised: 3 August 2020 / Accepted: 26 August 2020 / © Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract Sound field reproduction using Higher-Order Ambisonics (HOA) has been widely studied in recent years. However, HOA reproduces a sound field with the least-squares solution of the spherical harmonics (SH) coefficients rather than of the global sound field. In this paper, we try to reduce the reproduction error with a data-driven method. As is well known, Generative Adversarial Networks (GANs) can generate data similar to a given data set. In the proposed approach, a GAN converts the target sound fields into sound fields that can be reproduced accurately. The data set of target sound fields is updated with the generated fields that have lower reproduction error, and thus the reproduction error is reduced. We simulated the performance with four loudspeakers: sound fields of 4 orders of SH coefficients were reproduced with GAN and HOA at 1000 Hz, with average reproduction errors of 0.3 and 0.6, respectively. Simulations show that the gap between the least-squares solution and the optimal solution is reduced with our method, and the performance of HOA is thereby improved.

Keywords Spherical harmonics · Loudspeaker array · Sound field reproduction · Generative adversarial network
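The least-squares step described in the abstract can be illustrated with a minimal sketch. This is not the paper's method: it uses a 2D circular-harmonic analogue of the 3D SH formulation, models each loudspeaker as a far-field plane-wave source, and leaves out the GAN entirely. The function names, the loudspeaker layout, the sound speed of 343 m/s, and the 0.1 m evaluation radius are all assumptions chosen for illustration. The sketch fits four loudspeaker gains to nine harmonic coefficients by least squares and then measures the normalized squared error of the reproduced field over a disc, which is generally nonzero even though the coefficient fit is optimal in the least-squares sense.

```python
import numpy as np

def mode_matching_gains(speaker_angles, source_angle, order):
    """Least-squares gains matching circular-harmonic modes -order..order (2D analogue)."""
    m = np.arange(-order, order + 1)[:, None]        # mode indices, shape (2N+1, 1)
    A = np.exp(-1j * m * speaker_angles[None, :])    # mode matrix, shape (2N+1, L)
    b = np.exp(-1j * m[:, 0] * source_angle)         # target plane-wave modes
    gains, *_ = np.linalg.lstsq(A, b, rcond=None)
    return gains

def plane_wave(points, angle, k):
    """Plane wave exp(i k x.u) with unit direction u at the given angle."""
    direction = np.array([np.cos(angle), np.sin(angle)])
    return np.exp(1j * k * (points @ direction))

def reproduction_error(gains, speaker_angles, source_angle, k, radius, n=41):
    """Normalized squared error between target and reproduced fields on a disc."""
    axis = np.linspace(-radius, radius, n)
    xx, yy = np.meshgrid(axis, axis)
    inside = xx**2 + yy**2 <= radius**2
    points = np.stack([xx[inside], yy[inside]], axis=1)
    target = plane_wave(points, source_angle, k)
    reproduced = sum(g * plane_wave(points, a, k)
                     for g, a in zip(gains, speaker_angles))
    return np.sum(np.abs(target - reproduced) ** 2) / np.sum(np.abs(target) ** 2)

# Assumed setup: four equally spaced loudspeakers, 1000 Hz, c = 343 m/s.
speaker_angles = np.deg2rad([45.0, 135.0, 225.0, 315.0])
k = 2 * np.pi * 1000.0 / 343.0
source_angle = np.deg2rad(20.0)

gains = mode_matching_gains(speaker_angles, source_angle, order=4)
error = reproduction_error(gains, speaker_angles, source_angle, k, radius=0.1)
print("gains:", np.round(gains, 3))
print("normalized field error:", error)
```

With more coefficients (nine) than loudspeakers (four), the system is overdetermined, so the gains minimize the coefficient residual rather than the field error on the disc. When the source direction coincides with a loudspeaker, the fit is exact and the field error vanishes.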

Multimedia Tools and Applications

Xiaochen Wang, [email protected]

1 National Engineering Research Center for Multimedia Software, School of Computer Science, Wuhan University, Wuhan, 430072, China
2 Hubei Key Laboratory of Multimedia and Network Communication Engineering, Wuhan University, Wuhan, 430072, China
3 Collaborative Innovation Center of Geospatial Technology, Wuhan, 430079, China

1 Introduction

Sound field reproduction aims to reproduce the full information of a sound field, so that listeners experience both the sound source and the acoustic environment.

There are different methods of spatial sound reproduction. On the one hand, there are binaural techniques [43], which deliver a convincing experience over two channels by presenting the necessary binaural cues. On the other hand, there are multichannel sound systems and methods. One of them is the 22.2 multichannel sound system [37], which uses 24 loudspeakers on three layers to produce three-dimensional spatial impressions of sound. Other methods reproduce the sound pressure at target points or in target regions to reproduce the desired sound field. The first kind is the pressure matching (PM) method [27]. The second kind is the model-matching method, including wave-field synthesis (WFS) [3, 15] and Higher-order Ambisonics (HOA) [2, 25], which typically use a large number of channels to reproduce the wavefront generated by a virtual source. Conti