Simultaneous Sound Source Localization by Proposed Cuboids Nested Microphone Array Based on Subband Generalized Eigenval




1 Department of Electricity, Universidad Tecnológica Metropolitana, Av. Jose Pedro Alessandri 1242, 7800002 Santiago, Chile {adehghanfirouzabadi,hdurney,msanhueza}@utem.cl
2 Electrical Engineering Department, Pontificia Universidad Católica de Chile, Santiago, Chile
3 Electrical Engineering Department, Universidad de Santiago de Chile, Santiago, Chile
4 Department of Computing and Industries, Universidad Católica del Maule, 3466706 Talca, Chile
5 Department of Electrical Engineering, Universidad de Chile, Santiago, Chile

Abstract. Multiple sound source localization is an important application in speech processing. In this paper, a cuboids nested microphone array (CuNMA) is proposed for sound acquisition; the use of this array also eliminates spatial aliasing. Subband processing based on the GammaTone filter bank is then proposed. Next, the generalized eigenvalue decomposition (GEVD) algorithm is applied to all microphone pairs of the CuNMA in each subband of the GammaTone filter bank. In each subband, the standard deviation (SD) of all direction of arrival (DOA) estimates is calculated, and the subbands with improper information are eliminated. Then, K-means clustering with the silhouette criterion is applied to all DOAs to estimate the number of speakers and to allocate the related DOAs to each cluster. The proposed method is compared with steered response power-phase transform (SRP-PHAT), Geometric Projection, and spectral source model-deep neural network (SSM-DNN) on simulated data in noisy and reverberant conditions. The results show the superiority of the proposed method over these previous works.

Keywords: Sound source localization · Nested microphone array · GammaTone filter bank · Subband processing · Clustering
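The last two stages named in the abstract (SD-based subband rejection, then K-means with the silhouette criterion) can be illustrated with a small sketch. This is not the authors' implementation: the function names, the SD threshold, the candidate range for the number of speakers, and the evenly spaced K-means initialization are all assumptions made for illustration. The sketch treats DOAs as plain scalars in degrees and ignores angle wraparound. Because the silhouette score needs at least two clusters, it also does not cover the single-speaker case.

```python
import statistics


def keep_reliable_subbands(doas_by_subband, max_sd_deg):
    """Keep subbands whose DOA estimates have SD at or below the threshold.

    max_sd_deg is a hypothetical threshold; the abstract does not give a value.
    """
    return [d for d in doas_by_subband if statistics.pstdev(d) <= max_sd_deg]


def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda c: abs(p - centers[c]))
            for p in points]


def kmeans_1d(points, k, iters=50):
    """Lloyd's K-means on scalars with a deterministic, evenly spaced init."""
    pts = sorted(points)
    centers = [pts[(2 * i + 1) * len(pts) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        labels = assign(points, centers)
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # An empty cluster keeps its previous center.
            updated.append(sum(members) / len(members) if members else centers[c])
        if updated == centers:
            break
        centers = updated
    return assign(points, centers), centers


def mean_silhouette(points, labels, k):
    """Average silhouette value (b - a) / max(a, b) over all points."""
    scores = []
    for i, p in enumerate(points):
        own = [abs(p - q) for j, q in enumerate(points)
               if j != i and labels[j] == labels[i]]
        others = []
        for c in range(k):
            dists = [abs(p - q) for q, lab in zip(points, labels) if lab == c]
            if c != labels[i] and dists:
                others.append(sum(dists) / len(dists))
        if not own or not others:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        a, b = sum(own) / len(own), min(others)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


def estimate_sources(doas, k_max=4):
    """Pick the cluster count in 2..k_max with the best mean silhouette."""
    best = None
    for k in range(2, k_max + 1):
        labels, centers = kmeans_1d(doas, k)
        score = mean_silhouette(doas, labels, k)
        if best is None or score > best[0]:
            best = (score, k, sorted(centers))
    return best[1], best[2]


if __name__ == "__main__":
    # Made-up DOA estimates (degrees), one list per subband.
    subbands = [
        [29.0, 30.0, 31.0],
        [89.0, 90.0, 91.0],
        [30.5, 29.5, 30.0],
        [10.0, 75.0, 140.0],  # widely scattered, so it is rejected
        [90.5, 89.5, 90.0],
    ]
    kept = keep_reliable_subbands(subbands, max_sd_deg=5.0)
    doas = [d for band in kept for d in band]
    print(estimate_sources(doas))
```

The cluster centers returned by `estimate_sources` play the role of the per-speaker DOA estimates, and the chosen cluster count plays the role of the estimated number of speakers.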

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 A. E. Hassanien et al. (Eds.): AISI 2020, AISC 1261, pp. 816–825, 2021. https://doi.org/10.1007/978-3-030-58669-0_72




1 Introduction

Sound source localization is an active and important field in speech signal processing and has been the subject of extensive research. Its applications include hearing aid systems [1], robotics [2], and videoconferencing [3]. Different strategies are used for source localization, based on the time difference of arrival (TDOA) [4] and on energy propagation [5]. TDOA-based localization methods have lower computational complexity, but their accuracy is lower in noisy and reverberant conditions. Energy-based methods are slower because of their high computational complexity, but they offer higher accuracy and robustness in adverse conditions. Several algorithms have been proposed for direction of arrival (DOA) estimation based on microphone arrays. The most common methods are estimating signal paramet