A Perceptual Model for Sinusoidal Audio Coding Based on Spectral Integration
Steven van de Par
Digital Signal Processing Group, Philips Research Laboratories, 5656 AA Eindhoven, The Netherlands

Armin Kohlrausch
Digital Signal Processing Group, Philips Research Laboratories, 5656 AA Eindhoven, The Netherlands; Department of Technology Management, Eindhoven University of Technology, 5600 MB Eindhoven, The Netherlands

Richard Heusdens
Department of Mediamatics, Delft University of Technology, 2600 GA Delft, The Netherlands

Jesper Jensen
Department of Mediamatics, Delft University of Technology, 2600 GA Delft, The Netherlands

Søren Holdt Jensen
Department of Communication Technology, Institute of Electronic Systems, Aalborg University, DK-9220 Aalborg, Denmark

Received 31 October 2003; Revised 22 July 2004

Psychoacoustical models have been used extensively within audio coding applications over the past decades. Recently, parametric coding techniques have been applied to general audio and this has created the need for a psychoacoustical model that is specifically suited for sinusoidal modelling of audio signals. In this paper, we present a new perceptual model that predicts masked thresholds for sinusoidal distortions. The model relies on signal detection theory and incorporates more recent insights about spectral and temporal integration in auditory masking. As a consequence, the model is able to predict the distortion detectability. In fact, the distortion detectability defines a (perceptually relevant) norm on the underlying signal space which is beneficial for optimisation algorithms such as rate-distortion optimisation or linear predictive coding. We evaluate the merits of the model by combining it with a sinusoidal extraction method and compare the results with those obtained with the ISO MPEG-1 Layer I-II recommended model. Listening tests show a clear preference for the new model. More specifically, the model presented here leads to a reduction of more than 20% in terms of number of sinusoids needed to represent signals at a given quality level.

Keywords and phrases: audio coding, psychoacoustical modelling, auditory masking, spectral masking, sinusoidal modelling, psychoacoustical matching pursuit.
1. INTRODUCTION
This is an open-access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

The ever-increasing growth of application areas such as consumer electronics, broadcasting (digital radio and television), and multimedia/Internet has created a demand for high-quality digital audio at low bit rates. Over the last decade, this has led to the development of new coding techniques based on models of human auditory perception (psychoacoustical masking models). Examples include the coding techniques used in the ISO/IE