Using Pitch, Amplitude Modulation, and Spatial Cues for Separation of Harmonic Instruments from Stereo Music Recordings
Research Article
John Woodruff¹ and Bryan Pardo²
¹ Music Technology Program, School of Music, Northwestern University, Evanston, IL 60208, USA
² Department of Electrical Engineering and Computer Science, Northwestern University, Evanston, IL 60208, USA
Received 2 December 2005; Revised 30 July 2006; Accepted 10 September 2006
Recommended by Masataka Goto

Recent work in blind source separation applied to anechoic mixtures of speech allows for improved reconstruction of sources that rarely overlap in a time-frequency representation. While the assumption that speech mixtures do not overlap significantly in time-frequency is reasonable, music mixtures rarely meet this constraint, requiring new approaches. We introduce a method that uses spatial cues from anechoic, stereo music recordings and assumptions regarding the structure of musical source signals to effectively separate mixtures of tonal music. We discuss existing techniques to create partial source signal estimates from regions of the mixture where source signals do not overlap significantly. We use these partial signals within a new demixing framework, in which we estimate harmonic masks for each source, allowing the determination of the number of active sources in important time-frequency frames of the mixture. We then propose a method for distributing energy from time-frequency frames of the mixture to multiple source signals. This allows dealing with mixtures that contain time-frequency frames in which multiple harmonic sources are active without requiring knowledge of source characteristics.

Copyright © 2007 Hindawi Publishing Corporation. All rights reserved.
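
As a rough illustration of the harmonic-mask idea mentioned in the abstract, the following sketch marks the frequency bins that lie near the harmonics of each source's fundamental frequency and then counts how many sources claim each bin. It is a minimal sketch under stated assumptions, not the authors' procedure: the function names, the fixed tolerance in Hz, the number of harmonics, the analysis settings, and the two example fundamentals (220 Hz and 330 Hz) are all hypothetical choices made for this example.

```python
import numpy as np

def harmonic_mask(f0, bin_freqs, n_harmonics=10, tol_hz=20.0):
    """Boolean mask over frequency bins (hypothetical helper).

    A bin is True if its center frequency lies within tol_hz of any of
    the first n_harmonics integer multiples of f0.
    """
    harmonics = f0 * np.arange(1, n_harmonics + 1)            # shape (H,)
    dist = np.abs(bin_freqs[:, None] - harmonics[None, :])    # shape (F, H)
    return (dist <= tol_hz).any(axis=1)

def count_active_sources(f0s, bin_freqs, **kwargs):
    """Stack one harmonic mask per source and count claims per bin."""
    masks = np.stack([harmonic_mask(f0, bin_freqs, **kwargs) for f0 in f0s])
    return masks, masks.sum(axis=0)

# Assumed analysis settings for a single time frame (illustrative only).
sample_rate = 44100
n_fft = 4096
bin_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)

# Two hypothetical sources a perfect fifth apart; their harmonics
# coincide at 660 Hz, 1320 Hz, and 1980 Hz.
masks, active = count_active_sources([220.0, 330.0], bin_freqs)

# Bins with active == 1 belong to a single source; bins with active == 2
# are shared and would need their energy distributed between sources.
shared_bins = np.flatnonzero(active == 2)
```

In this toy setting, bins claimed by exactly one mask could be assigned directly to that source, while bins claimed by more than one are the overlapping frames for which the abstract proposes distributing energy among several sources.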
1. INTRODUCTION
Source separation is the process of determining individual source signals, given only mixtures of the source signals. When prior analysis of the individual sound sources is not possible, the problem is considered blind source separation (BSS). In this work, we focus on the BSS problem as it relates to recordings of music. A tool that can accomplish blind separation of musical mixtures would be of use to recording engineers, composers, multimedia producers, and researchers. Accurate source separation would be of great utility in many music information retrieval tasks, such as music transcription, vocalist and instrument identification, and melodic comparison of polyphonic music. Source separation would also facilitate post production of preexisting recordings, sample-based musical composition, multichannel expansion of mono and stereo recordings, and structured audio coding.

The following section contains a discussion of related work in source separation, with an emphasis on current work in music source separation. In Section 3 we present a new source separation approach, designed to isolate multiple
simultaneous instruments from an anechoic, stereo mixture of tonal music. The proposed method incorporates existing statistical BSS techniques and perceptually signif