Multiple Scale Music Segmentation Using Rhythm, Timbre, and Harmony
Research Article

Kristoffer Jensen
Department of Medialogy, Aalborg University Esbjerg, Niels Bohrs Vej 6, Esbjerg 6700, Denmark

Received 30 November 2005; Revised 27 August 2006; Accepted 27 August 2006
Recommended by Ichiro Fujinaga

The segmentation of music into intro, chorus, verse, outro, and similar segments is a difficult topic. A method for automatic segmentation based on features related to rhythm, timbre, and harmony is presented and evaluated, both by comparing the features with each other and by comparing them against a manual segmentation of a database of 48 songs. Standard information retrieval performance measures are used in the comparison, and the timbre-related feature is shown to perform best.

Copyright © 2007 Hindawi Publishing Corporation. All rights reserved.
1. INTRODUCTION
Segmentation has a perceptual and subjective nature. Manual segmentation can be based on different attributes of the music, such as rhythm, timbre, or harmony. Measuring similarity between music segments is a fundamental problem in computational music theory. In this work, automatic music segmentation is performed using three different features, calculated so as to be related to the perception of rhythm, timbre, and harmony.

Segmentation of music has many applications, such as music information retrieval, copyright infringement resolution, fast music navigation, and repetitive structure finding. Navigation in particular has been a key motivation for this work, with possible inclusion in the mixxx [1] DJ simulation software. Another possibility is the use of automatic segmentation for music recomposition [2]. In addition, the visualization of the rhythm-, timbre-, and harmony-related features is believed to be a useful tool for computer-aided music analysis.

Music segmentation is a popular research topic today. Several authors have presented segmentation and visualization of music using a self-similarity matrix [3–5] with good results. Foote [5] used a measure of novelty calculated from the self-similarity matrix. Cooper and Foote [6] used singular value decomposition of the self-similarity matrix for automatic audio summary generation. Jensen [7] reduced the processing cost by using a smoothed novelty measure, calculated on a small square along the diagonal of the self-similarity matrix. In [8], short and long features are used for summary generation using image structuring filters and unsupervised learning. Dannenberg and Hu [9] use ad hoc dynamic programming algorithms on different audio features to identify patterns in music. Goto [10] detects the chorus section by identifying repeated sections in the chroma feature. Other segmentation approaches include information-theoretic methods [11]. Jehan [12] recently proposed a recursive multiclass approach to the analysis of acoustic similarities in popular music using dynamic programming.

A previous work used a model of rhythm, the rhythmogram, to segment popular Chinese music [13]. The rhythmogram
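To make the self-similarity/novelty idea referenced above concrete, the following is a minimal sketch, not the method of any cited paper: it builds a cosine self-similarity matrix over feature frames and correlates a checkerboard kernel along its diagonal, in the spirit of Foote's novelty measure [5]. The synthetic features, kernel width, and function names are illustrative assumptions, not taken from the article.

```python
import numpy as np

def self_similarity(features):
    # Cosine similarity between every pair of feature frames
    # (rows of `features`); yields an N x N matrix.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)
    return unit @ unit.T

def checkerboard_kernel(width):
    # Four quadrants: +1 for within-segment blocks, -1 for
    # across-segment blocks, so homogeneous regions cancel out.
    sign = np.ones(width)
    sign[width // 2:] = -1.0
    return np.outer(sign, sign)

def novelty_curve(ssm, width=8):
    # Slide the kernel along the diagonal of the self-similarity
    # matrix; peaks mark transitions between homogeneous sections.
    kernel = checkerboard_kernel(width)
    n = ssm.shape[0]
    half = width // 2
    curve = np.zeros(n)
    for i in range(half, n - half):
        patch = ssm[i - half:i + half, i - half:i + half]
        curve[i] = np.sum(patch * kernel)
    return curve

# Synthetic example: two homogeneous "sections" with distinct
# feature means, joined at frame 40.
rng = np.random.default_rng(0)
a = rng.normal(0.0, 0.1, (40, 12)) + np.eye(12)[0]
b = rng.normal(0.0, 0.1, (40, 12)) + np.eye(12)[1]
features = np.vstack([a, b])

curve = novelty_curve(self_similarity(features))
boundary = int(np.argmax(curve))  # peaks near the section change
```

In real use the synthetic frames would be replaced by per-frame audio features (e.g. the rhythm, timbre, or harmony features discussed in the article), and the kernel width would be chosen relative to the frame rate and the segment durations of interest.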