Joint Acoustic and Modulation Frequency

Les Atlas
Department of Electrical Engineering, Box 352500, Seattle, WA 98195-2500, USA
Email: [email protected]

Shihab A. Shamma
Department of Electrical and Computer Engineering and Center for Auditory and Acoustic Research, Institute for Systems Research, University of Maryland, College Park, MD 20742, USA
Email: [email protected]

Received 30 August 2002; revised 5 February 2003

Abstract: There is considerable evidence that our perception of sound uses important features related to underlying signal modulations. This topic has been studied extensively via perceptual experiments, yet there are few, if any, well-developed signal processing methods that capitalize on or model these effects. We begin by summarizing evidence of the importance of modulation representations from psychophysical, physiological, and other sources. The concept of a two-dimensional joint acoustic and modulation frequency representation is proposed. A single sinusoidal amplitude modulator of a sinusoidal carrier is then used to illustrate properties of an unconstrained and ideal joint representation. Added constraints are required to remove or reduce undesired interference terms and to provide invertibility. It is then noted that these constraints would also apply to more general and complex cases of broader modulators and carriers. Applications in single-channel speaker separation and in audio coding illustrate the applicability of this joint representation. Other applications in signal analysis and filtering are suggested.

Keywords and phrases: digital signal processing, acoustics, audition, talker separation, modulation spectrum.
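To make the abstract's test case concrete, the following NumPy sketch builds a joint acoustic/modulation frequency plane in two stages: a magnitude STFT supplies the acoustic-frequency axis, and a second FFT across time within each acoustic-frequency bin supplies the modulation-frequency axis. This is only an illustrative two-stage construction, not the paper's exact transform, and every parameter (sample rate, carrier and modulator frequencies, frame and hop sizes) is an assumption chosen for the example.

```python
import numpy as np

fs = 8000                          # sample rate in Hz (illustrative)
t = np.arange(fs) / fs             # one second of signal

# The abstract's simple test case: a sinusoidal carrier (here 1 kHz)
# amplitude-modulated by a single sinusoid (here 40 Hz).
carrier_hz, mod_hz = 1000.0, 40.0
x = (1.0 + 0.8 * np.cos(2 * np.pi * mod_hz * t)) \
    * np.cos(2 * np.pi * carrier_hz * t)

# Stage 1: magnitude STFT -> acoustic-frequency axis.
frame, hop = 256, 64
win = np.hanning(frame)
n_frames = 1 + (len(x) - frame) // hop
env = np.abs(np.array([np.fft.rfft(win * x[i * hop:i * hop + frame])
                       for i in range(n_frames)]))   # (time, acoustic bin)

# Stage 2: FFT along time in each acoustic bin -> modulation-frequency
# axis. Subtract each bin's mean so the static (DC) envelope does not
# swamp the modulation peak.
joint = np.abs(np.fft.rfft(env - env.mean(axis=0), axis=0))

# The joint plane should peak near (modulation 40 Hz, acoustic ~1 kHz).
mod_bin, ac_bin = np.unravel_index(np.argmax(joint), joint.shape)
mod_freq = mod_bin * (fs / hop) / n_frames   # frame rate / FFT length
ac_freq = ac_bin * fs / frame
print(f"acoustic ~{ac_freq:.0f} Hz, modulation ~{mod_freq:.1f} Hz")
```

Note the design constraint visible even in this sketch: the modulation-frequency axis is bounded by half the STFT frame rate (here 125/2 Hz), so the hop size fixes how fast a modulation this representation can capture.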

1. INTRODUCTION

Over the last decade, human interfaces with computers have passed through a transition where images, video, and sounds are now fundamental parts of man/machine communications. In the future, machine recognition of images, video, and sound will likely be even more integral to computing. Much progress has been made in the fundamental scientific understanding of human perception and why it is so robust. Our current knowledge of perception has greatly improved the usefulness of information technology. For example, image and music compression techniques owe much of their efficiency to perceptual coding. However, it is easy to see from the large bandwidth gaps between waveform- and structural-based (synthesized) models [1] that there is still room for significant improvement in perceptual understanding and modeling. This paper's aim is a step in this direction. It proposes to integrate a concept of sensory perception with signal processing methodology to achieve a significant improvement in the representation and coding of acoustic signals. Specifically, we will explore how the auditory perception of very low-frequency modulations of acoustic energy can be abstracted and mathematically formulated as invertible transforms that will prove to be extremely effective in the coding, modification, and automatic classification of speech.