Neuromimetic Sound Representation for Percept Detection and Manipulation



Dmitry N. Zotkin
Perceptual Interfaces and Reality Laboratory, Institute for Advanced Computer Studies (UMIACS), University of Maryland, College Park, MD 20742, USA
Email: [email protected]

Taishih Chi
Neural Systems Laboratory, The Institute for Systems Research, University of Maryland, College Park, MD 20742, USA
Email: [email protected]

Shihab A. Shamma
Neural Systems Laboratory, The Institute for Systems Research, University of Maryland, College Park, MD 20742, USA
Email: [email protected]

Ramani Duraiswami
Perceptual Interfaces and Reality Laboratory, Institute for Advanced Computer Studies (UMIACS), University of Maryland, College Park, MD 20742, USA
Email: [email protected]

Received 2 November 2003; Revised 4 August 2004

The acoustic wave received at the ears is processed by the human auditory system to separate different sounds along the intensity, pitch, and timbre dimensions. Conventional Fourier-based signal processing, while endowed with fast algorithms, cannot easily represent a signal along these attributes. In this paper, we discuss the creation of maximally separable sounds in auditory user interfaces and use a recently proposed cortical sound representation, which performs a biomimetic decomposition of an acoustic signal, to represent and manipulate sound for this purpose. We briefly overview algorithms for obtaining, manipulating, and inverting a cortical representation of a sound, and describe algorithms for manipulating signal pitch and timbre separately. The algorithms are also used to create the sound of an instrument intermediate between a "guitar" and a "trumpet." Excellent sound quality can be achieved if processing time is not a concern, and intelligible signals can be reconstructed in reasonable processing time (about ten seconds of computation for a one-second signal sampled at 8 kHz). Work on bringing the algorithms into the real-time processing domain is ongoing.

Keywords and phrases: anthropomorphic algorithms, pitch detection, human sound perception.
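The abstract refers to decomposing a signal along perceptual rather than linear-frequency axes. As a minimal illustration of one ingredient of such an analysis, logarithmic (constant-Q-like) frequency banding, the sketch below sums FFT energy into octave-fraction bands. This is a toy stand-in only, not the cortical model used in the paper; the function name and parameter values are invented for illustration.

```python
import numpy as np

def log_band_energies(signal, fs, fmin=100.0, fmax=3200.0, bands_per_octave=4):
    """Toy log-frequency band analysis: sum FFT energy into bands whose
    edges are spaced by a fixed fraction of an octave (illustrative only)."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    n_octaves = np.log2(fmax / fmin)
    n_bands = int(n_octaves * bands_per_octave)
    edges = fmin * 2.0 ** (np.arange(n_bands + 1) / bands_per_octave)
    energies = np.array([
        np.sum(spectrum[(freqs >= lo) & (freqs < hi)] ** 2)
        for lo, hi in zip(edges[:-1], edges[1:])
    ])
    return edges, energies

# A 440 Hz tone, one second at 8 kHz (the sampling rate used in the paper).
fs = 8000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 440.0 * t)
edges, energies = log_band_energies(tone, fs)
peak_band = int(np.argmax(energies))  # band whose edges bracket 440 Hz
```

On a log-frequency axis, a fixed musical interval (e.g., an octave) always spans the same number of bands, which is closer to how pitch relations are perceived than a linear Hz axis.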

1. INTRODUCTION

When a natural sound source such as a human voice or a musical instrument produces a sound, the resulting acoustic wave is generated by a time-varying excitation pattern driving a possibly time-varying acoustical system, so the sound characteristics depend on both the excitation signal and the production system. The production system (e.g., the human vocal tract, a guitar body, or a flute tube) has its own characteristic response. Varying the excitation parameters produces a sound signal that has different frequency components but still retains perceptual characteristics uniquely identifying the production instrument (the identity of the person; the type of instrument, such as piano or violin; and even the specific type of piano on which it was produced). When one attempts to characterize such a sound source using descriptions based on Fourier analysis, one discovers that concepts such as frequency and amplitude are insufficient to explain these perceptual characteristics.
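The source-filter picture above can be sketched numerically: the same periodic excitation passed through two different "production system" responses yields signals that share a pitch (the harmonic spacing set by the excitation) but differ in timbre (the spectral envelope imposed by each response). All names and parameter values below are illustrative choices, not taken from the paper.

```python
import numpy as np

fs = 8000          # sampling rate, Hz
f0 = 200.0         # excitation pitch, Hz (hypothetical)
n = fs             # one second of signal

# Excitation: a periodic impulse train at the fundamental frequency.
excitation = np.zeros(n)
excitation[::int(fs / f0)] = 1.0

def resonator_ir(fc, decay, fs, length=512):
    """Impulse response of a toy damped resonance standing in for a
    production system's characteristic response (illustrative)."""
    t = np.arange(length) / fs
    return np.exp(-decay * t) * np.sin(2 * np.pi * fc * t)

# Two "instruments": identical excitation, different system responses.
bright = np.convolve(excitation, resonator_ir(1200.0, 400.0, fs))[:n]
mellow = np.convolve(excitation, resonator_ir(500.0, 400.0, fs))[:n]

# With n = fs, each rfft bin is 1 Hz wide, so the argmax bin index is the
# dominant frequency in Hz.
dom_bright = int(np.argmax(np.abs(np.fft.rfft(bright))))
dom_mellow = int(np.argmax(np.abs(np.fft.rfft(mellow))))
```

Both signals contain harmonics at multiples of 200 Hz, so they share the same pitch; the resonance center decides which harmonics dominate, which is one crude correlate of timbre. The paper's point is precisely that a raw Fourier description exposes this structure only indirectly.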