Speech Compression


Abstract The speech signal is usually sampled at 8,000 Hz. With uniform quantization at 8 bits/sample, 1 s of speech sampled at this rate requires 64,000 bits. The redundancy in the speech signal can be exploited to reduce this to as few as 3,000 bits for 1 s of data. This is known as digital speech compression. Compression lowers the quality of the speech signal. This chapter discusses techniques for compressing speech data, such as nonuniform quantization, adaptive differential pulse code modulation, and code excited linear prediction, along with the methodology for measuring the quality of the compressed speech signal.

4.1 Uniform Quantization

Let the amplitude of the speech signal lie in the range \(-\frac{A}{2}\) to \(\frac{A}{2}\). If the range is divided into a finite number of levels (say \(N\)), the quantization step is given as \(\delta = \frac{A}{N}\). If the actual sample value lies between \(n\delta\) and \((n + \frac{1}{2})\delta\), the sample value is quantized to \(n\delta\). Similarly, if the actual sample value lies between \((n + \frac{1}{2})\delta\) and \((n + 1)\delta\), the sample value is quantized to \((n + 1)\delta\). In either case, the quantization error lies between \(-\frac{\delta}{2}\) and \(\frac{\delta}{2}\); let it be uniformly distributed over that interval, so that its density is \(f(e) = \frac{1}{\delta}\). The average quantization noise power introduced by quantization is computed as follows.

\[
E(e^2) = \int_{-\delta/2}^{\delta/2} e^2 f(e)\,de = \frac{1}{\delta}\int_{-\delta/2}^{\delta/2} e^2\,de = \frac{\delta^2}{12} \tag{4.1}
\]

With \(N\) quantization levels, uniform quantization of the speech signal with step size \(\delta\) requires \(\log_2(N)\) bits per sample.
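The result in (4.1) can be checked numerically. The following is a minimal sketch, not taken from the book: the function name `uniform_quantize`, the values `A = 2` and `N = 256`, and the uniformly distributed test signal are all assumptions chosen for illustration. Rounding to the nearest multiple of \(\delta\) follows the rule described above.

```python
import math
import numpy as np

def uniform_quantize(x, A, N):
    """Quantize x to the nearest multiple of delta = A / N (hypothetical helper)."""
    delta = A / N
    # Values in [n*delta, (n + 1/2)*delta) go to n*delta, the rest to (n + 1)*delta.
    # Note: rounding this way also produces the two edge values -A/2 and A/2.
    return delta * np.round(x / delta), delta

rng = np.random.default_rng(0)            # fixed seed so the run is repeatable
A, N = 2.0, 256                           # assumed amplitude range and level count
x = rng.uniform(-A / 2, A / 2, size=200_000)  # stand-in for speech samples

xq, delta = uniform_quantize(x, A, N)
err = xq - x                              # quantization error e

measured = np.mean(err ** 2)              # empirical E(e^2)
predicted = delta ** 2 / 12               # value given by (4.1)
bits = math.log2(N)                       # bits per sample for N levels

print("measured noise power :", measured)
print("predicted delta^2/12 :", predicted)
print("bits per sample      :", bits)
```

The measured and predicted noise powers should agree closely for this test signal, because its quantization error is very nearly uniform on \(-\frac{\delta}{2}\) to \(\frac{\delta}{2}\). Real speech is not uniformly distributed, so the agreement there is only approximate.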

E. S. Gopi, Digital Speech Processing Using Matlab, Signals and Communication Technology, DOI: 10.1007/978-81-322-1677-3_4, © Springer India 2014


4.2 Nonuniform Quantization

Let the amplitude of the speech signal range from \(-\frac{A}{2}\) to \(\frac{A}{2}\). The nonuniform quantization is performed as described below:

1. If the actual value lies between \(-\frac{A}{2}\) and \(p_1\), assign the sample value \(q_1\).
2. If the actual value lies between \(p_1\) and \(p_2\), assign the sample value \(q_2\).
3. In general, if the actual value lies between \(p_i\) and \(p_{i+1}\), assign the sample value \(q_{i+1}\).
4. If the actual value lies between \(p_{n-1}\) and \(p_n = \frac{A}{2}\), assign the sample value \(q_n\).
5. Note that the values of \(p_i\) are chosen such that the average error is zero for every interval (i.e., \(p_i = \frac{q_i + q_{i+1}}{2}\)).

The optimal values for \(q_1, \ldots, q_n\) are selected such that the average squared quantization noise is minimized, as described below. The quantization error is \(\delta(x) = (q_1 - x)\) if \(x\) lies between \(-\frac{A}{2}\) and \(p_1\), \(\delta(x) = (q_2 - x)\) if \(x\) lies between \(p_1\) and \(p_2\), and so on. Hence the quantization error \(\delta\) is treated as a function of \(x\). The average squared quantization noise is computed as follows, where \(p(x)\) is the probability density function of the speech signal and \(p_0 = -\frac{A}{2}\). Usually, the probability density function of the speech signal is modeled as Gaussian.

\[
E(\delta(x)^2) = \int_{-A/2}^{A/2} \delta(x)^2\, p(x)\,dx \tag{4.2}
\]

\[
= \sum_{i=1}^{n} \int_{p_{i-1}}^{p_i} p(x)\,(q_i - x)^2\,dx \tag{4.3}
\]
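The sum in (4.3) can be evaluated numerically for any candidate set of levels. The sketch below is illustrative only and makes several assumptions that are not in the text: the helper names (`boundaries`, `quantize`, `noise_power`, `gaussian_pdf`), the values `A = 2`, `n = 8` and `sigma = 0.1`, the squared-spacing rule used to build the nonuniform levels, and the trapezoidal integration. It does not compute the optimal \(q_i\); it only compares two hand-picked level sets.

```python
import numpy as np

def boundaries(q, A):
    """Return p_0..p_n with p_0 = -A/2, p_i = (q_i + q_{i+1}) / 2, p_n = A/2."""
    q = np.asarray(q, dtype=float)
    return np.concatenate(([-A / 2], (q[:-1] + q[1:]) / 2, [A / 2]))

def quantize(x, q, A):
    """Map each x to the level of the interval it falls in (q must be ascending)."""
    q = np.asarray(q, dtype=float)
    p = boundaries(q, A)
    return q[np.searchsorted(p[1:-1], x)]   # interior boundaries select the interval

def noise_power(q, A, pdf, points=2001):
    """Approximate (4.3) with the trapezoidal rule on each interval."""
    q = np.asarray(q, dtype=float)
    p = boundaries(q, A)
    total = 0.0
    for i in range(len(q)):
        x = np.linspace(p[i], p[i + 1], points)
        f = pdf(x) * (q[i] - x) ** 2         # integrand p(x) (q_i - x)^2
        total += np.sum((f[1:] + f[:-1]) / 2 * np.diff(x))
    return total

def gaussian_pdf(sigma):
    """Zero-mean Gaussian density, the usual model for speech amplitudes."""
    return lambda x: np.exp(-x ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

A, n, sigma = 2.0, 8, 0.1                    # assumed values for illustration
uniform_levels = (np.arange(n) + 0.5) * A / n - A / 2
# Assumed nonuniform rule: squaring packs the levels more densely near zero.
nonuniform_levels = np.sign(uniform_levels) * (A / 2) * (2 * np.abs(uniform_levels) / A) ** 2

noise_uniform = noise_power(uniform_levels, A, gaussian_pdf(sigma))
noise_nonuniform = noise_power(nonuniform_levels, A, gaussian_pdf(sigma))
print("uniform levels    :", noise_uniform)
print("nonuniform levels :", noise_nonuniform)
```

For a narrow Gaussian such as this one, most samples fall near zero, so the level set that is denser near zero would be expected to give the smaller noise power. With a flat density and equally spaced levels, `noise_power` reduces to the \(\frac{\delta^2}{12}\) of (4.1).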

To obtain the optimal values of qi that minimize E(δ