Model-Based Speech Signal Coding Using Optimized Temporal Decomposition for Storage and Broadcasting Applications

Chandranath R. N. Athaudage
ARC Special Research Center for Ultra-Broadband Information Networks (CUBIN), Department of Electrical and Electronic Engineering, The University of Melbourne, Victoria 3010, Australia
Email: [email protected]

Alan B. Bradley
Institution of Engineers Australia, North Melbourne, Victoria 3051, Australia
Email: [email protected]

Margaret Lech
School of Electrical and Computer System Engineering, Royal Melbourne Institute of Technology (RMIT) University, Melbourne, Victoria 3001, Australia
Email: [email protected]

Received 27 May 2002 and in revised form 17 March 2003

A dynamic programming-based optimization strategy for a temporal decomposition (TD) model of speech is presented, together with its application to low-rate speech coding for storage and broadcasting. In previous work with the spectral stability-based event localizing (SBEL) TD algorithm, event localization was performed using a spectral stability criterion. Although this approach gave reasonably good results, there was no assurance that the event locations were optimal. In the present work, we have optimized the event localizing task using a dynamic programming-based optimization strategy. Simulation results show that improved TD model accuracy can be achieved. A methodology for incorporating the optimized TD algorithm within the standard MELP speech coder, for the efficient compression of speech spectral information, is also presented. The performance evaluation results revealed that the proposed speech coding scheme achieves 50%–60% compression of speech spectral information with negligible degradation in the decoded speech quality.

Keywords and phrases: temporal decomposition, speech coding, spectral parameters, dynamic programming, quantization.

1. INTRODUCTION

While practical issues such as delay, complexity, and fixed rate of encoding are important for speech coding applications in telecommunications, they can be significantly relaxed for speech storage applications such as store-forward messaging and broadcasting systems. In this context, it is desirable to know what optimal compression performance is achievable if associated constraints are relaxed. Various techniques for compressing speech information exploiting the delay domain, for applications where delay does not need to be strictly constrained (in contrast to full-duplex conversational communication), are found in the literature [1, 2, 3, 4, 5]. However, only very few have addressed the issue from an optimization perspective. Specifically, temporal decomposition (TD) [6, 7, 8, 9, 10, 11], which is very effective in representing the temporal structure of speech and for removing temporal redundancies, has not been given adequate treatment for optimal performance to be achieved. Such an optimized TD (OTD) algorithm would be useful for speech coding applications such as voice store-forward messaging systems, and multimedia voice-output systems,