Multiclass support vector machines for environmental sounds classification in visual domain based on log-Gabor filters

Souli Sameh · Zied Lachiri

Received: 8 May 2012 / Accepted: 11 August 2012 / Published online: 6 September 2012 © Springer Science+Business Media, LLC 2012

Abstract This paper presents an approach aimed at recognizing environmental sounds for surveillance and security applications. We propose a robust environmental sound classification approach based on spectrogram features derived from log-Gabor filters. This approach includes three methods. In the first two methods, the spectrograms are passed through appropriate log-Gabor filter banks, and the outputs are averaged and then undergo an optimal feature selection procedure based on a mutual information criterion. The third method uses the same steps but applies them only to three patches extracted from each spectrogram. To investigate the accuracy of the proposed methods, we conduct experiments using a large database containing 10 environmental sound classes. The classification results based on multiclass Support Vector Machines show that the second method is the most efficient, with an average classification accuracy of 89.62 %.

Keywords Environmental sounds · Visual features · Log-Gabor filters · Spectrogram · SVM multiclass
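The filtering, averaging and mutual-information steps named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the filter parameters (`n_scales`, `n_orients`, `min_wavelength`, `mult`, `sigma_on_f`, `theta_sigma`), the choice to average the response magnitudes across all filters, and the histogram-based mutual information estimate are all assumptions made for illustration. The SVM stage is left out.

```python
import numpy as np

def log_gabor_bank(shape, n_scales=2, n_orients=4, min_wavelength=3.0,
                   mult=2.0, sigma_on_f=0.65, theta_sigma=0.6):
    """Frequency-domain 2-D log-Gabor filters, one per (scale, orientation).

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    rows, cols = shape
    fy = np.fft.fftfreq(rows)[:, None]   # vertical frequencies (cycles/sample)
    fx = np.fft.fftfreq(cols)[None, :]   # horizontal frequencies
    radius = np.sqrt(fx ** 2 + fy ** 2)
    radius[0, 0] = 1.0                   # avoid log(0); DC gain is zeroed below
    theta = np.arctan2(fy, fx)
    filters = []
    for s in range(n_scales):
        f0 = 1.0 / (min_wavelength * mult ** s)   # centre frequency of this scale
        radial = np.exp(-(np.log(radius / f0)) ** 2
                        / (2 * np.log(sigma_on_f) ** 2))
        radial[0, 0] = 0.0               # a log-Gabor filter has no DC component
        for o in range(n_orients):
            angle = o * np.pi / n_orients
            # Angular distance to the filter orientation, wrapped to [-pi, pi].
            d_theta = np.arctan2(np.sin(theta - angle), np.cos(theta - angle))
            angular = np.exp(-d_theta ** 2 / (2 * theta_sigma ** 2))
            filters.append(radial * angular)
    return filters

def averaged_log_gabor_features(spectrogram, filters):
    """Filter a spectrogram with every filter and average the magnitudes."""
    spec_fft = np.fft.fft2(spectrogram)
    responses = [np.abs(np.fft.ifft2(spec_fft * f)) for f in filters]
    return np.mean(responses, axis=0).ravel()

def mutual_information(feature, labels, n_bins=8):
    """Histogram estimate of I(feature; label) in nats (an assumed estimator)."""
    edges = np.histogram_bin_edges(feature, bins=n_bins)
    binned = np.digitize(feature, edges[1:-1])    # bin indices 0 .. n_bins-1
    classes = np.unique(labels)
    joint = np.zeros((n_bins, classes.size))
    for j, c in enumerate(classes):
        joint[:, j] = np.bincount(binned[labels == c], minlength=n_bins)
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))

# Toy usage on random stand-in "spectrograms" (no real audio involved).
rng = np.random.default_rng(0)
bank = log_gabor_bank((16, 16))
X = np.array([averaged_log_gabor_features(rng.random((16, 16)), bank)
              for _ in range(20)])
y = np.array([0, 1] * 10)
scores = np.array([mutual_information(X[:, k], y) for k in range(X.shape[1])])
top_features = np.argsort(scores)[::-1][:10]      # indices of the 10 best scores
```

The columns of `X` indexed by `top_features` would then be passed to a multiclass SVM; the kernel and the multiclass strategy are not stated in the abstract, so they are not sketched here.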

S. Sameh
Signal, Image and Pattern Recognition Research Unit, Dept. of Genie Electrique, ENIT, BP 37, 1002, Le Belvédère, Tunisia
e-mail: [email protected]

S. Sameh · Z. Lachiri
Dept. of Physique and Instrumentation, INSAT, BP 676, 1080, Centre Urbain, Tunisia

Z. Lachiri
e-mail: [email protected]

1 Introduction

The automatic recognition of environmental sounds is an important problem in the audio domain. A variety of features have been proposed for audio recognition (Chu et al. 2009; Rabaoui et al. 2008), including descriptors such as MFCCs, frequency roll-off, spectral centroid, zero-crossing, energy, and Linear-Frequency Cepstral Coefficients (LFCCs). Some, or even all, of these 1-D audio features can be used in combination, and sometimes the combination increases classification performance compared with the individually used features. The problem is that many features negatively influence the quality of classification. Moreover, the recognition rate decreases as the number of targeted classes increases, because of difficulties such as randomness and high variance (Chu et al. 2009). Recently, some efforts have emerged in a new research direction, demonstrating that visual techniques can be applied to musical sounds (Yu and Slotine 2008). In order to explore the visual information of environmental sounds, our previous work integrated the audio texture concept as image textures (Souli and Lachiri 2011). Our goal is to develop an environmental sound classification method using advanced visual descriptors. The feature extraction method uses the time-frequency structure by means of translation-inv
