Very Low Rate Scalable Speech Coding through Classified Embedded Matrix Quantization


Research Article

Ehsan Jahangiri (1, 2) and Shahrokh Ghaemmaghami (2)

1 Department of Electrical & Computer Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
2 Department of Electrical Engineering, Sharif University of Technology, P.O. Box 14588-89694, Tehran, Iran

Correspondence should be addressed to Ehsan Jahangiri, [email protected]

Received 21 June 2009; Revised 2 February 2010; Accepted 19 February 2010

Academic Editor: Soren Jensen

Copyright © 2010 E. Jahangiri and S. Ghaemmaghami. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

This paper proposes a scalable speech coding scheme using embedded matrix quantization of the LSFs in the LPC model. For efficient quantization of the spectral parameters, two types of codebooks of different sizes are designed and used to encode unvoiced and mixed voicing segments separately. The tree-like structured codebooks of our embedded quantizer, constructed through a cell merging process, help to make a fine-grain scalable speech coder. Using an efficient adaptive dual-band approximation of the LPC excitation, where the voicing transition frequency is determined based on the concept of instantaneous frequency in the frequency domain, near-natural-sounding synthesized speech is achieved. Assessment results, including both overall quality and intelligibility scores, show that the proposed coding scheme can be a reasonable choice for speech coding in low-bandwidth communication applications.

1. Introduction

Scalable speech coding refers to coding schemes that reconstruct speech at different levels of accuracy or quality at various bit rates. The bit-stream of a scalable coder is composed of two parts: an essential part called the core unit and an optional part that includes enhancement units. The core unit provides minimal quality for the synthesized speech, while higher quality is achieved by adding the enhancement units.

Embedded quantization, which allows successive refinement of the reconstructed symbols, can be employed in speech coders to attain the scalability property. This quantization method has found useful applications in variable-rate and progressive transmission of digital signals. In an embedded quantizer, the output symbol of an i-bit quantizer is embedded in all output symbols of the (i + k)-bit quantizers, where k ≥ 1 [1]. In other words, higher rate codes contain lower rate codes plus bits of refinement. Embedded quantization was first introduced by Tzou [1] for scalar quantization. Tzou proposed a method to achieve embedded quantization by organizing the threshold levels in the form of binary trees, using the numerical optimization of Max [2]. Subsequently, embedded quantization was generalized to vector quantization (VQ). Some examples of such vector quantizers, which are