Bandwidth Extension of Telephone Speech Aided by Data Embedding

Research Article

Ariel Sagi and David Malah

Department of Electrical Engineering, Technion - Israel Institute of Technology, Haifa 32000, Israel

Received 18 February 2006; Revised 19 July 2006; Accepted 10 September 2006

Recommended by Tan Lee

A system for bandwidth extension of telephone speech, aided by data embedding, is presented. The proposed system uses the transmitted analog narrowband speech signal as a carrier of the side information needed to carry out the bandwidth extension. The upper band of the wideband speech is reconstructed at the receiving end from two components: a synthetic wideband excitation signal, generated from the narrowband telephone speech, and a wideband spectral envelope, parametrically represented and transmitted as embedded data in the telephone speech. We propose a novel data embedding scheme, in which the scalar Costa scheme is combined with an auditory masking model, allowing high-rate transparent embedding while maintaining a low bit error rate. The signal is transformed to the frequency domain via the discrete Hartley transform (DHT) and is partitioned into subbands. Data is embedded in an adaptively chosen subset of subbands by modifying the DHT coefficients. In our simulations, high-quality wideband speech was obtained from speech transmitted over a telephone line (characterized by spectral magnitude distortion, dispersion, and noise), in which side information data is transparently embedded at a rate of 600 information bits/second and with a bit error rate of approximately 3 × 10^-4. In a listening test, the reconstructed wideband speech was preferred (to varying degrees) over conventional telephone speech in 92.5% of the test utterances.

Copyright © 2007 A. Sagi and D. Malah.
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
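
As a rough illustration of the embedding idea summarized in the abstract, the following Python sketch transforms a frame with the DHT, embeds bits in a few coefficients with a basic scalar Costa quantizer, and decodes them again. It is a minimal sketch under stated assumptions, not the system described in this paper: the auditory masking model, the adaptive subband selection, the telephone channel, and any dither key are all omitted, and the frame length, coefficient indices, step size `delta`, and scale factor `alpha` are hypothetical values chosen only for the example.

```python
import numpy as np

def dht(x):
    """Discrete Hartley transform via the FFT: H[k] = Re(X[k]) - Im(X[k])."""
    X = np.fft.fft(x)
    return X.real - X.imag

def idht(h):
    """The DHT is its own inverse up to a factor of 1/N."""
    return dht(h) / len(h)

def scs_embed(coeffs, bits, delta, alpha):
    """Scalar Costa embedding, one bit per coefficient (no dither key).

    Each coefficient is moved a fraction `alpha` of the way toward the
    nearest point of the lattice delta*Z shifted by bit*delta/2.
    """
    offset = delta * np.asarray(bits) / 2.0
    shifted = coeffs - offset
    quant_err = delta * np.round(shifted / delta) - shifted
    return coeffs + alpha * quant_err

def scs_decode(coeffs, delta):
    """Hard decision: near the unshifted lattice -> 0, near the half-step lattice -> 1."""
    y = delta * np.round(coeffs / delta) - coeffs  # lies in [-delta/2, delta/2]
    return (np.abs(y) > delta / 4.0).astype(int)

# Illustrative usage on synthetic data (all values below are assumptions).
rng = np.random.default_rng(0)
frame = rng.standard_normal(64)       # stand-in for one speech frame
bits = rng.integers(0, 2, size=8)     # hypothetical payload
idx = np.arange(8, 16)                # hypothetical, fixed "subband"

H = dht(frame)
H[idx] = scs_embed(H[idx], bits, delta=1.0, alpha=0.8)
marked = idht(H)                      # time-domain frame carrying the data

decoded = scs_decode(dht(marked)[idx], delta=1.0)
```

With no channel distortion, the decoder recovers the embedded bits because each modified coefficient lies within a quarter step of its target lattice point. Over a real telephone line, the choice of step size trades embedding distortion against robustness to noise.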

1. INTRODUCTION

Public telephone systems reduce the bandwidth of the transmitted speech signal from an effective frequency range of 50 Hz to 7 kHz to the range of 300 Hz to 3.4 kHz. The reduced bandwidth gives so-called telephone speech its characteristic thin and muffled sound. Listening tests have shown that the speech bandwidth affects the perceived speech quality [1]. Artificially extending the bandwidth of the narrowband (NB) speech signal can result in both higher intelligibility and higher subjective quality of the reconstructed wideband (WB) speech. Usually, the information required for speech bandwidth extension (SBE) [2] is either generated from the received NB speech or transmitted separately. Typically, the latter method results in higher quality of the reconstructed WB speech. This paper proposes an SBE system in which the transmission to and from the talker's handset is analog, making it particularly suitable for the public telephone system. The proposed scheme uses the speech signal as a carrier of