A Supervised Classification Algorithm for Note Onset Detection
Research Article
Alexandre Lacoste and Douglas Eck
Department of Computer Science, University of Montreal, Montreal, QC, Canada H3T 1J4
Received 5 December 2005; Revised 9 August 2006; Accepted 26 August 2006
Recommended by Ichiro Fujinaga
This paper presents a novel approach to detecting onsets in music audio files. We use a supervised learning algorithm to classify spectrogram frames extracted from digital audio as being onsets or nononsets. Frames classified as onsets are then treated with a simple peak-picking algorithm based on a moving average. We present two versions of this approach. The first version uses a single neural network classifier. The second version combines the predictions of several networks trained using different hyperparameters. We describe the details of the algorithm and summarize the performance of both variants on several datasets. We also examine our choice of hyperparameters by describing results of cross-validation experiments done on a custom dataset. We conclude that a supervised learning approach to note onset detection performs well and warrants further investigation.
Copyright © 2007 Hindawi Publishing Corporation. All rights reserved.
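As a rough illustration of the peak-picking step mentioned in the abstract, the sketch below marks a frame as an onset when its classifier score is a local maximum and exceeds a centered moving average by some margin. This is not the authors' implementation: the function names, the window half-width `half_width`, the margin `delta`, and the exact thresholding rule are all assumptions made for illustration, since this section does not specify them.

```python
from typing import List, Sequence


def moving_average(values: Sequence[float], half_width: int) -> List[float]:
    """Centered moving average; the window is truncated at the edges."""
    n = len(values)
    averages = []
    for i in range(n):
        lo = max(0, i - half_width)
        hi = min(n, i + half_width + 1)
        averages.append(sum(values[lo:hi]) / (hi - lo))
    return averages


def pick_onsets(
    scores: Sequence[float],
    half_width: int = 3,   # hypothetical window half-width, in frames
    delta: float = 0.1,    # hypothetical margin above the moving average
) -> List[int]:
    """Return indices of frames that are local maxima of `scores`
    and exceed the local moving average by more than `delta`."""
    avg = moving_average(scores, half_width)
    onsets = []
    for i in range(1, len(scores) - 1):
        is_local_max = scores[i - 1] < scores[i] >= scores[i + 1]
        if is_local_max and scores[i] > avg[i] + delta:
            onsets.append(i)
    return onsets


if __name__ == "__main__":
    # Toy per-frame onset scores, standing in for classifier outputs.
    frame_scores = [0.0, 0.1, 0.9, 0.2, 0.0, 0.0, 0.1, 0.8, 0.1, 0.0]
    print(pick_onsets(frame_scores, half_width=2, delta=0.1))
```

Comparing each score against a local average rather than a fixed global threshold lets the detector adapt to passages where the overall score level drifts. A real system would tune the window and margin on held-out data.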
1. INTRODUCTION
This paper is concerned with finding the onset times of notes in music audio. Though conceptually simple, this task is deceptively difficult to perform automatically with a computer. Consider, for example, the naïve approach of finding amplitude peaks in the raw waveform. This strategy fails except for trivially easy cases such as monophonic percussive instruments. At the same time, onset detection is implicated in a number of important music information retrieval (MIR) tasks, and thus warrants research.
Onset detection is useful in the analysis of temporal structure in music, such as tempo identification and meter identification. Music classification and music fingerprinting are two other relevant areas where onset detection can play a role. In the case of classification, onset locations could be used to significantly reduce the number of frame-level features retained. For example, a sampling method could be used that preferentially selects frames near predicted onset locations. A related segmentation strategy for genre classification was used by West and Cox [1]. In the case of music fingerprinting, onset times could be used as the basis of a robust fingerprint vector.
Onset detection is also important in areas involving the structured representation of music. For example, music editing (performed using, e.g., a sequencer) can be simplified by using automatic onset detection to segment a waveform into logical parts. Also, onset detection is fundamentally important for the problem of automatic music transcription, where a structured symbolic representation (usually a traditional music score) is inferred from a waveform.
Onset detection algorithms can generally be divided into three steps: (1) transformation of the waveform to isolate different frequency bands, in