Real-time speech enhancement algorithm for transient noise suppression


Ruiyu Liang¹ & Yue Xie¹ & Jiaming Cheng² & Guichen Tang¹ & Shinuo Sun²

Received: 18 August 2019 / Revised: 16 August 2020 / Accepted: 9 September 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract

To suppress both stationary and transient noise, a real-time single-channel speech enhancement algorithm is proposed. First, the quantile noise estimation method is used to obtain the spectrum of the stationary noise. Then, a transient noise detection method based on the normalized variance and gravity center of the signal is proposed to modify the stationary noise spectrum. Next, the speech presence probability is estimated from speech features and harmonic analysis. Finally, the optimally modified log-spectral amplitude (OM-LSA) estimator is adopted for speech enhancement. The experimental noise set contains 115 environmental sounds at SNRs from −10 to 10 dB. The experimental results show that the proposed algorithm performs comparably to the OM-LSA algorithm, which has good denoising performance, while its real-time performance is much better. Compared with the WebRTC real-time algorithm, across stationary and transient noise combined, the overall speech quality indicators of the improved algorithm increased by 7.5%, 7.8% and 5.0%, respectively, and the short-time objective intelligibility increased by 2.4%, 2.4% and 2.0%, respectively. Even compared with a recurrent neural network (RNN) algorithm, the proposed algorithm suppresses transient noise better. In addition, a real-time experiment on the hardware platform shows that processing a 10 ms frame takes 4.3 ms.

Keywords Speech enhancement · Transient noise suppression · The quantile noise estimation · Harmonic analysis
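
The quantile noise estimation step named in the abstract can be illustrated with a minimal sketch. It assumes the generic formulation of the technique: for each frequency bin, the noise power is taken as a fixed quantile of that bin's power over time, since speech occupies any one bin only part of the time. The function name `quantile_noise_estimate`, the default `q=0.5`, and the toy data are hypothetical choices for illustration and are not taken from the paper.

```python
import math


def quantile_noise_estimate(power_frames, q=0.5):
    """Estimate per-bin noise power as the q-quantile over time.

    power_frames: list of frames, each a list of non-negative power
    values (one per frequency bin). All frames share the same length.
    q: quantile in [0, 1]; 0.5 (the median) is an assumed default.
    """
    if not power_frames:
        raise ValueError("need at least one frame")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    n_frames = len(power_frames)
    n_bins = len(power_frames[0])
    # Position of the q-quantile in a sorted track of n_frames values.
    idx = min(n_frames - 1, int(math.floor(q * n_frames)))
    noise = []
    for k in range(n_bins):
        # Sort this bin's power values across all frames.
        track = sorted(frame[k] for frame in power_frames)
        noise.append(track[idx])
    return noise


# Toy data: a constant noise floor (1.0 and 2.0 in the two bins) with a
# few high-power frames standing in for speech or transient bursts.
frames = [[1.0, 2.0] for _ in range(10)]
frames[2][0] = 100.0
frames[5][0] = 80.0
frames[7][1] = 50.0

estimate = quantile_noise_estimate(frames, q=0.5)
print(estimate)  # [1.0, 2.0]
```

Because the bursts occupy only a minority of frames in each bin, the median ignores them and returns the noise floor. A real-time implementation would typically work on a sliding buffer of recent frames rather than the whole signal.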

* Ruiyu Liang [email protected]

¹ School of Information and Communication Engineering, Nanjing Institute of Technology, Nanjing 211167, China
² School of Information Science and Engineering, Southeast University, Nanjing 210096, China

Multimedia Tools and Applications

1 Introduction

Although speech enhancement algorithms have been studied for decades, they remain an active topic in speech processing. Early single-channel speech enhancement algorithms mainly studied how to estimate the noise spectrum from noisy speech and then suppress it. In recent years, with the rise of deep learning [13] and its successful application to speech recognition [7], speech enhancement algorithms based on supervised learning have begun to demonstrate their worth [40]. Deep neural networks (DNN) [24, 45], convolutional neural networks (CNN) [10], long short-term memory (LSTM) networks [41, 42] and generative adversarial networks (GAN) [26, 30] have all been used to implement speech enhancement. These supervised learning models demonstrate superior performance over traditional enhancement methods in the ca