Semi-Supervised Learning to Enhance Speech Signal for Mobile Communication


ORIGINAL RESEARCH

Purushotham Ugraiah1 · Chethan Kanakapura Shivabasave Gowda2

Received: 20 August 2020 / Accepted: 14 October 2020
© Springer Nature Singapore Pte Ltd 2020

Abstract Developing a machine learning algorithm to enhance complex speech signals for mobile communication is one of the research problems in signal processing. The objective of this paper is to develop a learning algorithm that improves the quality and intelligibility of voice signals that are corrupted by real-world noise while they are transmitted through the channel. We consider a semi-supervised machine learning algorithm, shipped with the system software of a mobile phone, that improves the SNR of a speech signal corrupted by man-made disturbance. Most such disturbances are non-stationary, so the noise does not affect all spectral components uniformly. In the proposed algorithm, the system is trained with a database of speech and noise. System parameters are derived during training and are updated according to the disturbance present in the signal. These parameters are then used to remove the noise from the speech signal. The results show an SNR improvement of 5–8% compared with traditional methods.

Keywords  Speech enhancement · Semi-supervised · Non-stationary · Machine learning · SNR

This article is an extended version of our own paper entitled “Speech Enhancement Using Semi-Supervised Learning”, presented at the 2020 International Conference on Intelligent Engineering and Management (ICIEM), with a more detailed viewpoint.

This article is part of the topical collection “Computational Statistics” guest edited by Anish Gupta, Mike Hinchey, Vincenzo Puri, Zeev Zalevsky and Wan Abdul Rahim.

* Purushotham Ugraiah [email protected]
Chethan Kanakapura Shivabasave Gowda [email protected]

1 Department of Electronics and Communication, PES University, Bangalore, India
2 Department of Electronics and Communication, RVITM, Bangalore, India

Introduction Speech signals are complex in nature, and they get corrupted by multiple kinds of noise. In voice communication, man-made noise is harsher than environmental disturbance [1]. Such noise is also complex and very hard to remove; hence conventional speech enhancement (SE) using a single algorithm without any training will not provide the required quality. Classical methods [2–4] used to remove additive noise from a speech signal include spectral subtraction and Wiener filtering. Non-linear estimation of the signal is done using the MMSE and log-MMSE estimators. In recent years, machine learning architectures have proved successful in speech recognition, and deep neural networks (DNNs) are used for automatic speech recognition. Since DNN-based modeling is complex, simplified machine learning algorithms such as semi-supervised learning can be used for speech enhancement. Semi supervised learning [5] i