Speech intelligibility enhancement: a hybrid Wiener approach

V. Srinivasarao1 · Umesh Ghanekar1
Received: 6 May 2020 / Accepted: 8 July 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract Speech enhancement primarily aims to improve the intelligibility and quality of a speech signal by using various algorithms and techniques. Processing a speech signal means applying efficient mechanisms that reduce noise while extracting the intended speech from the corrupted signal. Noise reduction techniques such as Kalman filtering, spectral subtraction and adaptive Wiener filtering are used in different speech enhancement scenarios. The proposed method combines a Wiener filter with the Karhunen–Loève Transform (KLT) to remove noise and enhance the noisy speech signal. This paper evaluates the proposed hybrid algorithm using Signal to Noise Ratio (SNR), Perceptual Evaluation of Speech Quality (PESQ), Short-Time Objective Intelligibility (STOI) and Extended STOI (ESTOI) values. The algorithm was tested under varied noise conditions, and the results demonstrate its effectiveness. A subjective listening evaluation was also carried out, and both the objective and subjective results confirm a significant improvement in speech intelligibility with the proposed method.

Keywords  Speech intelligibility · Speech enhancement · Noise reduction · Wiener filter · Hybrid algorithm · Signal to noise ratio
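As a rough illustration of two quantities named in the abstract, the Python sketch below computes a signal to noise ratio in decibels and the textbook per-bin Wiener gain. It is not the authors' hybrid Wiener/KLT algorithm. The function names `snr_db` and `wiener_gain` are hypothetical, and the definitions are the standard ones, assumed here rather than taken from the paper.

```python
import math


def snr_db(clean, estimate):
    """SNR in dB of `estimate` measured against the reference `clean` signal.

    Assumed definition: 10 * log10(sum(s^2) / sum((s - e)^2)).
    """
    if len(clean) != len(estimate):
        raise ValueError("signals must have the same length")
    signal_power = sum(s * s for s in clean)
    noise_power = sum((s - e) ** 2 for s, e in zip(clean, estimate))
    if noise_power == 0.0:
        return math.inf   # estimate is identical to the reference
    if signal_power == 0.0:
        return -math.inf  # silent reference, nonzero error
    return 10.0 * math.log10(signal_power / noise_power)


def wiener_gain(speech_power, noise_power):
    """Textbook Wiener gain for one frequency bin: P_s / (P_s + P_n)."""
    total = speech_power + noise_power
    return speech_power / total if total > 0.0 else 0.0


# Toy data (illustrative only): a clean signal and a slightly perturbed copy.
clean = [1.0, -1.0, 1.0, -1.0]
noisy = [1.1, -0.9, 1.1, -0.9]
print(snr_db(clean, noisy))    # close to 20 dB
print(wiener_gain(3.0, 1.0))   # 0.75
```

The gain lies between 0 and 1 and approaches 1 when the speech power dominates the noise power in a bin, which is why a Wiener filter attenuates noisy bins more strongly than clean ones.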

Affiliation 1 (V. Srinivasarao, Umesh Ghanekar): ECE Department, National Institute of Technology, Kurukshetra, Haryana, India

1 Introduction
Speech enhancement is a wide area of research in speech signal processing. The aim of any speech enhancement technique or algorithm is to reduce the noise in noisy speech. In general, noise can be broadband or narrowband, stationary or non-stationary, and additive or multiplicative (Manohar and Rao 2005). Most research in this area focuses on broadband, additive, stationary noise. In speech signal processing, quality may be degraded by distortion or interference from various sources; typical sources of disturbance include quantization noise, amplifier noise, acoustic noise and background noise. Earlier algorithms reduced speech distortion under varying noise conditions, as confirmed by subjective listening tests. Filtering environmental noise nevertheless remains a problem, especially within an organization, so improving speech intelligibility under varying room boundary conditions still needs to be addressed (Krishnamurthy and Hansen 2009). In a noisy environment, the clean speech signal is modified by signal processing before it is corrupted by the noise (Akbacak and Hansen 2007). Previous research has noted that a speech enhancer with good PESQ performance can still deliver poor word error rate performance in recognition. The improvement in PESQ cannot be transferred directly