A Matlab Toolbox for Music Information Retrieval

We present MIRToolbox, an integrated set of functions written in Matlab and dedicated to the extraction from audio files of musical features related to, among others, timbre, tonality, rhythm and form. The objective is to offer a state of the art of computati



1 Motivation and approach

MIRToolbox is a Matlab toolbox dedicated to the extraction of musically related features from audio recordings. It has been designed in particular to enable the computation of a large range of features from databases of audio files, which can then be used in statistical analyses. We chose to base the design of the toolbox on the Matlab computing environment, as it offers good visualisation capabilities and gives access to a large variety of other toolboxes. In particular, MIRToolbox makes use of functions available in public-domain toolboxes such as the Auditory Toolbox (Slaney, 1998), NetLab (Nabney, 2002) and SOMtoolbox (Vesanto, 1999). It appeared that such a computational framework, because of its general objectives, could be useful to the research community in Music Information Retrieval (MIR), and also for teaching. For that reason, particular attention has been paid to the ease of use of the toolbox. The functions are called using a simple and adaptive syntax, while more expert users can specify a large range of options and parameters.

Olivier Lartillot, Petri Toiviainen and Tuomas Eerola

The different musical features extracted from the audio files are highly interdependent: in particular, as can be seen in Figure 1, some features are based on the same initial computations. To improve computational efficiency, it is important to avoid redundant computation of these common components. Each of these intermediary components, as well as each final musical feature, is therefore considered a building block that can be freely combined with the others. Moreover, in keeping with the objective of optimal ease of use, each building block has been designed so that it can adapt to the type of its input data.
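The idea of sharing common intermediate computations can be illustrated with a minimal sketch. It is written in Python rather than Matlab, and it is not the toolbox's own code: the function names `spectrum`, `centroid` and `peak_bin`, the naive DFT, and the use of a cache are all assumptions chosen for illustration. Two features are built on the same spectrum, which is computed only once.

```python
import math
from functools import lru_cache

# Hypothetical counter, used only to show how often the shared step runs.
CALLS = {"spectrum": 0}


@lru_cache(maxsize=None)
def spectrum(signal):
    """Magnitude spectrum of a tuple of samples, using a naive DFT."""
    CALLS["spectrum"] += 1
    n = len(signal)
    mags = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(signal))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(signal))
        mags.append(math.hypot(re, im))
    return tuple(mags)


def centroid(signal):
    """Spectral centroid in bin units, built on the shared spectrum."""
    mags = spectrum(signal)
    total = sum(mags)
    return sum(k * m for k, m in enumerate(mags)) / total if total else 0.0


def peak_bin(signal):
    """Index of the strongest bin, built on the same shared spectrum."""
    mags = spectrum(signal)
    return max(range(len(mags)), key=mags.__getitem__)


# A sine wave completing 4 cycles over 32 samples (a tuple, so it can be cached).
signal = tuple(math.sin(2 * math.pi * 4 * i / 32) for i in range(32))
c = centroid(signal)
p = peak_bin(signal)
```

After both features have been requested, `CALLS["spectrum"]` shows that the common component was evaluated a single time, however many features depend on it.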

2 Feature extraction

Figure 1 shows an overview of the main features considered in the toolbox. All the processes start from the audio signal (on the left) and form a chain of operations that develops horizontally towards the right. The vertical arrangement of the processes indicates increasing complexity of the operations, from the simplest computations (top) to more detailed auditory modelling (bottom). Each musical feature is related to one of the broad musical dimensions traditionally defined in music theory. Features related to pitch, to tonality (chromagram, key strength and key Self-Organising Map, or SOM) and to dynamics (Root Mean Square, or RMS, energy) are highlighted in bold. Features related to rhythm, namely tempo, pulse clarity and fluctuation, are shown in bold italics. A large set of features that can be associated with timbre is shown in plain italics. Among them, all the operators in grey italics can in fact be applied to many other representations: for instance, statistical moments such as centroid, kurtosis, etc., can be applied to spectra and envelopes, but also to any histogram based on any given feature.
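The remark that statistical moments apply to any representation can be made concrete with a short sketch. It is written in Python and is not taken from the toolbox: the function name `moments`, its argument names and the handling of the degenerate case are assumptions. The input is treated as a weighted distribution, so the same function serves for a spectrum (bin frequencies and magnitudes), an envelope (times and amplitudes) or a histogram (bin centres and counts).

```python
import math


def moments(values, weights):
    """Centroid, spread, skewness and (non-excess) kurtosis of a weighted distribution."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    probs = [w / total for w in weights]  # normalise weights to sum to 1
    mean = sum(v * p for v, p in zip(values, probs))
    variance = sum((v - mean) ** 2 * p for v, p in zip(values, probs))
    spread = math.sqrt(variance)
    if spread == 0:
        # Degenerate case: higher moments are undefined; 0.0 is returned
        # here purely as a convention of this sketch.
        return mean, 0.0, 0.0, 0.0
    skewness = sum((v - mean) ** 3 * p for v, p in zip(values, probs)) / spread ** 3
    kurtosis = sum((v - mean) ** 4 * p for v, p in zip(values, probs)) / spread ** 4
    return mean, spread, skewness, kurtosis


# A small symmetric distribution: it could equally be spectrum magnitudes
# per frequency bin or histogram counts per bin centre.
bins = [0, 1, 2, 3, 4]
magnitudes = [1, 2, 3, 2, 1]
centroid, spread, skewness, kurtosis = moments(bins, magnitudes)
```

Because the example distribution is symmetric around its middle bin, the centroid falls on that bin and the skewness vanishes.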

[Figure 1, partially recovered labels: Audio signal waveform; Zero-crossing rate; RMS energy; Envelope; Attack/Sustain/Release]
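Two of the simplest descriptors named in the figure, the zero-crossing rate and the RMS energy, can be computed directly from the waveform. The sketch below is written in Python and is not the toolbox's implementation: the function names are hypothetical, and expressing the zero-crossing rate in sign changes per second is only one common convention, assumed here.

```python
import math


def rms_energy(samples):
    """Root mean square of the sample values."""
    if not samples:
        raise ValueError("samples must not be empty")
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def zero_crossing_rate(samples, sample_rate):
    """Number of sign changes per second (assumed convention)."""
    if not samples:
        raise ValueError("samples must not be empty")
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings * sample_rate / len(samples)


# One second of a 5 Hz sine sampled at 1000 Hz; the half-sample offset
# keeps every sample away from an exact zero.
sample_rate = 1000
samples = [math.sin(2 * math.pi * 5 * (i + 0.5) / sample_rate) for i in range(sample_rate)]
rms = rms_energy(samples)
zcr = zero_crossing_rate(samples, sample_rate)
```

For a sine of unit amplitude the RMS energy approaches the square root of one half, and the zero-crossing rate approaches twice the frequency, which gives a quick sanity check on both functions.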