Time-Scale Invariant Audio Data Embedding
Mohamed F. Mansour
Department of Electrical and Computer Engineering, University of Minnesota, Minneapolis, MN 55414, USA
Email: [email protected]

Ahmed H. Tewfik
Department of Electrical and Computer Engineering, University of Minnesota, Minneapolis, MN 55414, USA
Email: [email protected]

Received 31 May 2002 and in revised form 22 December 2002

We propose a novel algorithm for high-quality data embedding in audio. The algorithm is based on changing the relative length of the middle segment between two successive maximum and minimum peaks to embed data. Spline interpolation is used to change the lengths. To ensure smooth monotonic behavior between peaks, a hybrid orthogonal and nonorthogonal wavelet decomposition is used prior to data embedding. The possible data embedding rates are between 20 and 30 bps. However, for practical purposes, we use repetition codes, and the effective embedding data rate is around 5 bps. The algorithm is invariant after time-scale modification, time shift, and time cropping. It gives high-quality output and is robust to mp3 compression.

Keywords and phrases: data embedding, broadcast monitoring, time-scale invariant, spline interpolation.
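The trade-off between the raw embedding rate and the effective rate under repetition coding can be illustrated with a minimal sketch. It is not the authors' implementation: the repetition factor of 5, the raw rate of 25 bps (a value inside the stated 20 to 30 bps range), the majority-vote decoder, and the names repeat_encode and majority_decode are all assumptions chosen for illustration.

```python
def repeat_encode(bits, r):
    """Repeat each payload bit r times (r is an assumed, odd repetition factor)."""
    return [b for b in bits for _ in range(r)]


def majority_decode(coded, r):
    """Recover each payload bit by majority vote over its r received copies."""
    return [int(2 * sum(coded[i:i + r]) > r) for i in range(0, len(coded), r)]


payload = [1, 0, 1, 1]
r = 5                     # hypothetical repetition factor, not given in the text
raw_rate_bps = 25         # assumed value within the stated 20-30 bps range

coded = repeat_encode(payload, r)

# Flip the first copy of every bit to mimic isolated decoding errors.
noisy = list(coded)
for i in range(0, len(noisy), r):
    noisy[i] ^= 1

decoded = majority_decode(noisy, r)
effective_rate_bps = raw_rate_bps / r

print(decoded, effective_rate_bps)
```

With these assumed numbers, one corrupted copy per bit is still corrected by the vote, and the effective rate comes out at 5 bps, which matches the order of magnitude quoted above.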
1. INTRODUCTION
In this paper, we introduce a new algorithm for high-capacity data embedding in audio that is suited for marketing, broadcast, and playback monitoring applications. The purpose of broadcast and playback monitoring is primarily to analyze the broadcast content and collect statistical data to improve the content quality. For this class of applications, security is not an important issue. However, the embedded data should survive the basic operations that the host audio signal may undergo.

The most important requirements of a data embedding system are transparency and robustness. Transparency means that there is no perceptual difference between the original and the modified host media. Data embedding techniques usually exploit irrelevancies in the digital representation to assure transparency. For audio data embedding, the masking phenomenon is usually exploited to assure that the distortion due to data embedding is imperceptible. Robustness means that the embedded data should remain in the host media regardless of the signal processing operations that the signal may undergo.

Research in audio watermarking can be classified into two broad classes: spread-spectrum watermarking and projection-based watermarking. In spread-spectrum watermarking, the data is embedded by adding a pseudorandom sequence (the watermark) to the audio signal or to some features derived from it. An example of spread-spectrum watermarking in the time domain was presented in [1]. The features used for data embedding include the phase of the Fourier coefficients [2], the middle-frequency coefficients [3], and the cepstrum coefficients [4]. More complicated structures for spread-spectrum watermarking (e.g., [5]) were proposed to synchronize the watermarked signal with the watermark prior to decoding. On th