Towards Structural Analysis of Audio Recordings in the Presence of Musical Variations

Research Article

Meinard Müller and Frank Kurth

Department of Computer Science III, University of Bonn, Römerstraße 164, 53117 Bonn, Germany

Received 1 December 2005; Revised 24 July 2006; Accepted 13 August 2006

Recommended by Ichiro Fujinaga

One major goal of structural analysis of an audio recording is to automatically extract the repetitive structure or, more generally, the musical form of the underlying piece of music. Recent approaches to this problem work well for music in which the repetitions largely agree with respect to instrumentation and tempo, as is typically the case for popular music. For other classes of music such as Western classical music, however, musically similar audio segments may exhibit significant variations in parameters such as dynamics, timbre, execution of note groups, modulation, articulation, and tempo progression. In this paper, we propose a robust and efficient algorithm for audio structure analysis, which makes it possible to identify musically similar segments even in the presence of large variations in these parameters. To account for such variations, our main idea is to incorporate invariance at various levels simultaneously: we design a new type of statistical features to absorb microvariations, introduce an enhanced local distance measure to account for local variations, and describe a new strategy for structure extraction that can cope with global variations. Our experimental results with classical and popular music show that our algorithm performs successfully even in the presence of significant musical variations.

Copyright © 2007 M. Müller and F. Kurth. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. INTRODUCTION

Content-based document analysis and efficient audio browsing in large music databases have become important issues in music information retrieval. Here, the automatic annotation of audio data by descriptive high-level features as well as the automatic generation of crosslinks between audio excerpts of similar musical content are of major concern. In this context, the subproblem of audio structure analysis or, more specifically, the automatic identification of musically relevant repeating patterns in an audio recording has been of considerable research interest; see, for example, [1–7]. Here, the crucial point is the notion of similarity used to compare different audio segments, because such segments may be regarded as musically similar in spite of considerable variations in parameters such as dynamics, timbre, execution of note groups (e.g., grace notes, trills, arpeggios), modulation, articulation, or tempo progression. In this paper, we introduce a robust and efficient algorithm for the structural analysis of audio recordings, which can cope with significant variations in the parameters mentioned above including lo