An Efficient VAD Based on a Generalized Gaussian PDF


Abstract. The emerging applications of wireless speech communication are demanding increasing levels of performance in noise adverse environments together with the design of high response rate speech processing systems. This is a serious obstacle to meeting the demands of modern applications, and therefore these systems often need a noise reduction algorithm working in combination with a precise voice activity detector (VAD). This paper presents a new VAD for improving speech detection robustness in noisy environments and the performance of speech recognition systems. The algorithm defines an optimum likelihood ratio test (LRT) involving Multiple and correlated Observations (MCO). An analysis of the methodology for N = {2, 3} shows the robustness of the proposed approach by means of a clear reduction of the classification error as the number of observations is increased. The algorithm is also compared to different VAD methods, including the G.729, AMR and AFE standards as well as recently reported algorithms, showing a sustained advantage in speech/non-speech detection accuracy and speech recognition performance.

1 Introduction

The emerging applications of speech communication are demanding increasing levels of performance in noise adverse environments. Examples of such systems are the new voice services including discontinuous speech transmission [1,2,3] or distributed speech recognition (DSR) over wireless and IP networks [4]. These systems often require a noise reduction scheme working in combination with a precise voice activity detector (VAD) [5] for estimating the noise spectrum during non-speech periods in order to compensate for its harmful effect on the speech signal. During the last decade numerous researchers have studied different strategies for detecting speech in noise and the influence of the VAD on the performance of speech processing systems [5]. Sohn et al. [6] proposed a robust VAD algorithm based on a statistical likelihood ratio test (LRT) involving a single observation vector. Later, Cho et al. [7] suggested an improvement based on a smoothed LRT. Most VADs in use today consider hangover algorithms based on empirical models to smooth the VAD decision. It has been shown recently [8,9] that incorporating long-term speech information into the decision rule yields benefits for speech/pause discrimination in high noise environments; however, an important assumption made in these previous works has to be revised: the independence of overlapped observations. In this work we propose a more realistic one: the observations are jointly Gaussian distributed with non-zero correlations. In addition, important issues that need to be addressed are: i) the increased computational complexity mainly due to the definition of the decision rule over large data sets, and ii) the optimum criterion of the decision rule. This work advanc

M. Chetouani et al. (Eds.): NOLISP 2007, LNAI 4885, pp. 246–254, 2007. © Springer-Verlag Berlin Heidelberg 2007
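To make the idea of a likelihood ratio test over correlated observations concrete, the following is a minimal sketch for N = 2. It is not the derivation of the paper, which is not reproduced in this excerpt. It assumes real-valued, zero-mean observations that follow an ordinary bivariate Gaussian with equal variances under each hypothesis (noise only versus speech plus noise). All function and parameter names (`log_pdf_bivariate`, `log_lrt`, `vad_decision`, `noise_var`, `speech_var`, `rho_noise`, `rho_speech`) and the zero threshold are hypothetical choices made for illustration.

```python
import math


def log_pdf_bivariate(x1, x2, var, rho):
    """Log density of a zero-mean bivariate Gaussian whose two components
    share the variance `var` and have correlation coefficient `rho`."""
    one_minus_r2 = 1.0 - rho * rho  # requires |rho| < 1
    # Quadratic form x^T C^{-1} x for C = var * [[1, rho], [rho, 1]].
    quad = (x1 * x1 - 2.0 * rho * x1 * x2 + x2 * x2) / (var * one_minus_r2)
    # log det C = log(var^2 * (1 - rho^2)).
    log_det = 2.0 * math.log(var) + math.log(one_minus_r2)
    return -math.log(2.0 * math.pi) - 0.5 * log_det - 0.5 * quad


def log_lrt(x1, x2, noise_var, speech_var, rho_noise, rho_speech):
    """Log-likelihood ratio of 'speech plus noise' versus 'noise only'
    for two correlated observations (assumed model, see lead-in)."""
    h1 = log_pdf_bivariate(x1, x2, noise_var + speech_var, rho_speech)
    h0 = log_pdf_bivariate(x1, x2, noise_var, rho_noise)
    return h1 - h0


def vad_decision(x1, x2, threshold=0.0, **model):
    """Declare speech when the log-likelihood ratio exceeds `threshold`."""
    return log_lrt(x1, x2, **model) > threshold
```

Setting both correlation coefficients to zero reduces the sketch to a test on two independent observations, which is the assumption the text argues should be revised. In practice the variances and correlations would have to be estimated from data, which this sketch leaves out.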
