A Bit Stream Scalable Speech/Audio Coder Combining Enhanced Regular Pulse Excitation and Parametric Coding

Research Article

Felip Riera-Palou (1, 2) and Albertus C. den Brinker (1)

1 Philips Research Laboratories, Digital Signal Processing Group, Prof. Holstlaan 4, 5656 AA Eindhoven, The Netherlands
2 Department of Mathematics and Informatics, University of the Balearic Islands, Carretera de Valldemossa km 7.5, 07122 Palma de Mallorca, Spain

Received 2 October 2006; Revised 16 March 2007; Accepted 29 June 2007

Recommended by Tan Lee

This paper introduces a new broadband audio and speech coding technique based on the combination of a pulse excitation coder and a standardized parametric coder, namely, the MPEG-4 high-quality parametric coder. After presenting a series of enhancements to regular pulse excitation (RPE) that make it suitable for modeling broadband signals, it is shown how pulse and parametric coding complement each other and how they can be merged to yield a layered, bit stream scalable coder able to operate at different points in the quality versus bit rate plane. The performance of the proposed coder is evaluated in a listening test. The major result is that the extra functionality of bit stream scalability does not come at the price of reduced performance, since the coder is competitive with standardized coders (MP3, AAC, SSC).

Copyright © 2007 F. Riera-Palou and A. C. den Brinker. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. INTRODUCTION

During the late eighties and early nineties, and with the explosive growth in the use of the Internet, the need for efficient audio representations became more evident and numerous compression methods were proposed. Coders developed within MPEG-2, like MP3 or AAC [1], are popular techniques in use today. Both techniques (MP3, AAC) are examples of lossy coding algorithms, in which the decoded signal is not a perfect copy of the original material because some information is thrown away during encoding. Information is discarded by exploiting the characteristics of the human hearing system so as to minimize the audible effects caused by the missing data. Despite these perceptual considerations, these coders essentially aim at a waveform match between the coded and the original signals. More recently, an alternative audio coding paradigm has received substantial attention from the research community. This technique, generically called parametric coding, fits the input signal to a predetermined model, thereby simplifying its representation [2, 3]. An example of this type of coder is the sinusoidal coder (SSC) recently introduced by Philips into MPEG-4 as Extension 2 (high-quality parametric coding) [4]. Using SSC, compression factors higher than 50 (24 kbit/s for a stereo CD stream) have been realized while still maintaining a good quality in the reconstructed signal, although significantly lower than that of th