High level feature extraction for the self-taught learning algorithm
RESEARCH
Open Access
Konstantin Markov1* and Tomoko Matsui2

Abstract

The availability of large amounts of raw unlabeled data has sparked the recent surge in semi-supervised learning research. In most works, however, it is assumed that the labeled and unlabeled data come from the same distribution. This restriction is removed in the self-taught learning algorithm, where the unlabeled data can come from a different distribution but must nevertheless have similar structure. First, a representation is learned from the unlabeled samples by decomposing their data matrix into two matrices, called the bases matrix and the activations matrix. This procedure is justified by the assumption that each sample is a linear combination of the columns of the bases matrix, which can therefore be viewed as high level features representing the knowledge learned from the unlabeled data in an unsupervised way. Next, activations of the labeled data are obtained using the bases, which are kept fixed. Finally, a classifier is built using these activations instead of the original labeled data. In this work, we investigated the performance of three popular matrix decomposition methods, Principal Component Analysis (PCA), Non-negative Matrix Factorization (NMF), and Sparse Coding (SC), as unsupervised high level feature extractors for the self-taught learning algorithm. We implemented this algorithm for the music genre classification task using two different databases: one as the unlabeled data pool and the other as data for supervised classifier training. The music pieces come from 10 and 6 genres, respectively, with only one genre common to both databases. Results from a wide variety of experimental settings show that the self-taught learning method improves the classification rate when the amount of labeled data is small and, more interestingly, that consistent improvement can be achieved for a wide range of unlabeled data sizes.
The best performance among the matrix decomposition approaches was achieved by the Sparse Coding method.

Introduction

A tremendous amount of music-related data has recently become available, either locally or remotely over networks, and technology for efficiently searching this content and retrieving music-related information is in demand. This technology comprises several elemental tasks, such as genre classification, artist identification, music mood classification, cover song identification, fundamental frequency estimation, and melody extraction. Essential for each task is feature extraction, as well as the choice of model or classifier. Audio signals are conventionally analyzed frame-by-frame using the Fourier or wavelet transform and coded as spectral feature vectors or chroma features extracted every several tens or hundreds of milliseconds. However, it remains an open question how precisely music audio should be coded, depending on the kind of task and the succeeding classifier.

*Correspondence: [email protected]
1 Department of Information Systems, The University of Aizu, Fukushima, Japan
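The conventional frame-by-frame coding mentioned above can be illustrated with a minimal sketch: the signal is cut into short overlapping windowed frames and each frame is mapped to a magnitude spectrum via the FFT. The frame length, hop size, and test tone below are illustrative choices, not parameters from the paper.

```python
import numpy as np

def spectral_frames(signal, frame_len=512, hop=256):
    """Cut a 1-D signal into overlapping Hann-windowed frames and
    return a (n_frames, frame_len // 2 + 1) matrix of magnitude spectra."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

# Example: 1 second of a 440 Hz tone sampled at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
spec = spectral_frames(np.sin(2 * np.pi * 440 * t))
print(spec.shape)  # one magnitude spectrum per frame
```

Each row of the resulting matrix is one spectral feature vector; stacking such rows over many pieces produces the data matrices that the decomposition methods operate on.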
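The three-step procedure described in the abstract (learn bases from unlabeled data, compute activations of labeled data with the bases fixed, train a classifier on the activations) can be sketched as follows. NMF stands in for any of the three decompositions studied; the data, matrix sizes, and classifier here are synthetic stand-ins, not the paper's actual setup.

```python
import numpy as np
from sklearn.decomposition import NMF
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Step 1: learn the bases matrix from the unlabeled pool (rows = samples).
X_unlabeled = rng.random((200, 40))          # 200 unlabeled samples, 40 dims
nmf = NMF(n_components=8, init="random", random_state=0, max_iter=500)
nmf.fit(X_unlabeled)                         # nmf.components_ holds the bases

# Step 2: with the bases kept fixed, obtain activations of the labeled data.
X_labeled = rng.random((20, 40))             # small labeled set
y_labeled = rng.integers(0, 2, size=20)      # toy binary labels
A_labeled = nmf.transform(X_labeled)         # activations, shape (20, 8)

# Step 3: build the classifier on the activations, not the raw features.
clf = LogisticRegression(max_iter=1000).fit(A_labeled, y_labeled)
print(A_labeled.shape)
```

Swapping `NMF` for `PCA` or a sparse-coding dictionary learner changes only Step 1; the fixed-bases projection and the downstream classifier are unchanged, which is what makes the comparison of the three extractors clean.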