DWT and LPC based feature extraction methods for isolated word recognition
Navnath S Nehe1* and Raghunath S Holambe2
* Correspondence: [email protected]
1 Department of Instrumentation Engineering, Pravara Rural Engineering College, Loni 413736, Maharashtra, India
Full list of author information is available at the end of the article
Abstract
In this article, new feature extraction methods that utilize wavelet decomposition and reduced-order linear predictive coding (LPC) coefficients have been proposed for speech recognition. The coefficients are derived from speech frames decomposed using the discrete wavelet transform (DWT). LPC coefficients derived from this subband decomposition of a speech frame (abbreviated as WLPC) provide a better representation than modeling the frame directly. The WLPC coefficients are further normalized in the cepstral domain to obtain a new set of features, denoted wavelet subband cepstral mean normalized features. The proposed approaches provide effective (better recognition rate), efficient (reduced feature vector dimension), and noise-robust features. The performance of these techniques has been evaluated on the TI-46 isolated word database and a Marathi digits database created by the authors, in a white noise environment, using the continuous density hidden Markov model. The experimental results also show the superiority of the proposed techniques over conventional methods such as linear predictive cepstral coefficients, Mel-frequency cepstral coefficients, spectral subtraction, and cepstral mean normalization in the presence of additive white Gaussian noise.

Keywords: feature extraction, linear predictive coding, discrete wavelet transform, cepstral mean normalization, hidden Markov model
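To make the pipeline in the abstract concrete, the following Python sketch outlines the general flow: decompose each speech frame with the DWT, fit a low-order LPC model to each subband signal, convert the LPC coefficients to cepstral coefficients, and subtract the utterance-level cepstral mean. This is a minimal sketch, assuming the NumPy, SciPy, and PyWavelets packages; the wavelet (db4), decomposition level (3), LPC order (6), and cepstral count (6) are illustrative choices, not the settings reported in this article.

```python
# Sketch of a WLPC-style front end: DWT subband decomposition, low-order LPC
# per subband, LPC-to-cepstrum conversion, and cepstral mean normalization.
# All parameter values are illustrative assumptions, not the authors' settings.
import numpy as np
import pywt
from scipy.linalg import solve_toeplitz

def lpc(signal, order):
    """Autocorrelation-method LPC coefficients (assumes len(signal) > order)."""
    r = np.correlate(signal, signal, mode="full")[len(signal) - 1:]
    r[0] += 1e-8  # tiny regularization so the Toeplitz system stays well conditioned
    return solve_toeplitz(r[:order], r[1:order + 1])

def lpc_to_cepstrum(a, n_ceps):
    """Standard recursion from predictor coefficients to LPC cepstral coefficients."""
    c = np.zeros(n_ceps)
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= len(a) else 0.0
        for k in range(1, n):
            if n - k <= len(a):
                acc += (k / n) * c[k - 1] * a[n - k - 1]
        c[n - 1] = acc
    return c

def wlpc_features(frame, wavelet="db4", level=3, order=6, n_ceps=6):
    """One frame -> concatenated per-subband LPC cepstral coefficients."""
    subbands = pywt.wavedec(frame, wavelet, level=level)
    return np.concatenate([lpc_to_cepstrum(lpc(sb, order), n_ceps)
                           for sb in subbands])

def wscmn(frames, **kw):
    """Stack per-frame features and subtract the utterance-level cepstral mean."""
    feats = np.vstack([wlpc_features(f, **kw) for f in frames])
    return feats - feats.mean(axis=0)
```

Under these assumed settings the feature vector has (level + 1) × n_ceps = 24 components per frame, which illustrates the kind of reduced feature dimension the abstract refers to.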
1. Introduction
A speech recognition system has two major components, namely, feature extraction and classification. The feature extraction method plays a vital role in the speech recognition task. There are two dominant approaches to acoustic measurement. The first is a temporal-domain, parametric approach such as linear prediction [1], developed to closely match the resonant structure of the human vocal tract that produces the corresponding sound. The linear prediction coefficient (LPC) technique is not well suited for representing speech because it assumes the signal is stationary within a given frame and hence cannot analyze localized events accurately. It is also unable to capture unvoiced and nasalized sounds properly [2]. The second is a nonparametric, frequency-domain approach based on the human auditory perception system, known as Mel-frequency cepstral coefficients (MFCC) [3]. The widespread use of MFCCs is due to their low computational complexity and better performance for ASR under clean, matched conditions. The performance of MFCC degrades rapidly in the presence of noise, and the degradation worsens as the signal-to-noise ratio (SNR) decreases. Poor performance of LPC and its different forms, such as reflection coefficients and linear prediction cepstral coefficients (LPCC), as well as MFCC and its various f
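Since the discussion contrasts LPC with the MFCC front end, a compact illustration may help. The sketch below is a minimal, generic MFCC computation using NumPy and SciPy, not the exact configuration used in this article; the FFT length, filterbank size, and number of coefficients are assumed values for illustration. It shows the nonparametric, perception-motivated steps: Hamming windowing, power spectrum, triangular mel filterbank, log compression, and DCT.

```python
# Minimal MFCC sketch: windowed frame -> power spectrum -> mel filterbank
# -> log -> DCT. Filterbank size, FFT length, and coefficient count are
# illustrative assumptions.
import numpy as np
from scipy.fft import dct

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, fs):
    """Triangular filters spaced uniformly on the mel scale."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(fs / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / fs).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        fb[i - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)   # rising edge
        fb[i - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)   # falling edge
    return fb

def mfcc(frame, fs, n_fft=512, n_filters=26, n_ceps=13):
    """MFCCs of a single speech frame."""
    spec = np.abs(np.fft.rfft(frame * np.hamming(len(frame)), n_fft)) ** 2
    energies = mel_filterbank(n_filters, n_fft, fs) @ spec
    return dct(np.log(energies + 1e-10), type=2, norm="ortho")[:n_ceps]
```

Because every step operates on the spectrum of the whole frame, additive noise spreads across all filterbank channels, which gives some intuition for the noise-related degradation described above.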