Low-complexity disordered speech quality estimation


Yousef S. Ettomi Ali¹ · Vijay Parsa¹,² · Phillip Doyle² · Soulaimane Berkane³

Received: 11 June 2019 / Accepted: 11 February 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020, corrected publication 2020

Abstract
Tracheoesophageal (TE) speech is generated by patients who have undergone a total laryngectomy, where the larynx (voice box) is removed and replaced by a tracheoesophageal puncture. This work presents a novel low-complexity algorithm to estimate the degree of severity of disordered TE speech. The proposed algorithm has two output scores, which are computed from 20 ms voiced frames of the speech signal. An 18th-order Linear Prediction (LP) analysis is performed on each voiced frame of the speech signal. The first output score uses features derived from high order statistics (mean, variance, skewness and kurtosis), which are calculated from the LP coefficients, the cepstral coefficients and the LP residual signal. These high order statistics (HOS), along with the pitch value, are averaged over all voiced frames, yielding a total of 14 HOS quality features. The second output score uses features derived from the estimated vocal tract model parameters (cross-sectional tube areas). Statistics of the vocal tract parameters (VTPs) across all voiced speech frames are used as speech quality features. Forward stepwise regression and K-fold cross validation are then used to select the best sets of features to be fed to the regression models. The results show high correlations with subjective scores for several regression techniques, reaching a correlation of up to 0.91 when the VTP-Gaussian model is used.

Keywords Tracheoesophageal speech · Speech quality · Linear prediction · Vocal tract parameters
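The per-frame analysis described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Hamming window, the autocorrelation (Levinson-Durbin) method, the 16 kHz sampling rate implied by the 320-sample example frame, the number of cepstral coefficients, and all function names are choices made here. Voicing detection, pitch estimation, and the vocal tract area features are omitted, so the sketch yields 12 statistics per frame rather than the paper's 14 HOS features.

```python
import math
import random

def autocorr(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the LP normal equations; returns (a, reflection coeffs, error).
    Convention: A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order."""
    a = [1.0] + [0.0] * order
    err = r[0]
    ks = []
    for m in range(1, order + 1):
        acc = r[m] + sum(a[j] * r[m - j] for j in range(1, m))
        k = -acc / err
        ks.append(k)
        new_a = a[:]
        for j in range(1, m):
            new_a[j] = a[j] + k * a[m - j]
        new_a[m] = k
        a = new_a
        err *= 1.0 - k * k
    return a, ks, err

def lpc_to_cepstrum(a, n_ceps):
    """Cepstrum of 1/A(z) via the standard LP-to-cepstrum recursion."""
    p = len(a) - 1
    c = [0.0] * (n_ceps + 1)
    for n in range(1, n_ceps + 1):
        acc = a[n] if n <= p else 0.0
        for k in range(1, n):
            if n - k <= p:
                acc += (k / n) * c[k] * a[n - k]
        c[n] = -acc
    return c[1:]

def hos(values):
    """Mean, variance, skewness and kurtosis (population definitions)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    if var == 0.0:
        return (mean, 0.0, 0.0, 0.0)
    std = math.sqrt(var)
    skew = sum(((v - mean) / std) ** 3 for v in values) / n
    kurt = sum(((v - mean) / std) ** 4 for v in values) / n
    return (mean, var, skew, kurt)

def frame_features(frame, order=18):
    """12 statistics for one voiced frame: HOS of the LP coefficients,
    the cepstral coefficients and the LP residual (assumed layout)."""
    n = len(frame)
    # Hamming window: an assumption, the abstract does not name a window.
    w = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
         for i, x in enumerate(frame)]
    a, _, _ = levinson_durbin(autocorr(w, order), order)
    ceps = lpc_to_cepstrum(a, order)
    # Residual e[n] = sum_j a[j] * x[n - j], with zero history before the frame.
    residual = [sum(a[j] * w[i - j] for j in range(min(i, order) + 1))
                for i in range(n)]
    return hos(a[1:]) + hos(ceps) + hos(residual)

def average_features(frames, order=18):
    """Average the per-frame statistics over all (voiced) frames."""
    feats = [frame_features(f, order) for f in frames]
    return [sum(col) / len(col) for col in zip(*feats)]

# Example: one synthetic 20 ms frame (320 samples at an assumed 16 kHz).
rng = random.Random(0)
example_frame = [math.sin(2 * math.pi * 120 * t / 16000)
                 + 0.5 * math.sin(2 * math.pi * 240 * t / 16000)
                 + 0.05 * rng.gauss(0, 1) for t in range(320)]
example_features = average_features([example_frame])
```

In a fuller pipeline, the averaged statistics would be combined with a pitch estimate and passed to the feature selection and regression stages that the abstract mentions; those stages are not sketched here.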

* Corresponding author: Yousef S. Ettomi Ali

¹ Department of Electrical and Computer Engineering, University of Western Ontario, London, ON, Canada
² School of Communications and Speech Disorders, University of Western Ontario, London, ON, Canada
³ Department of Computer Sciences and Engineering, University of Quebec in Outaouais, Gatineau, QC, Canada

1 Introduction
Voice and speech quality estimation is an important topic of research with many applications in telecommunication and biomedical engineering. Early algorithms that assess voice and speech quality were developed in the telecommunication industry to evaluate the performance of telecommunication channels, the accuracy of speech coding algorithms and often the efficiency of speech enhancement methods (Union 1996; Rix et al. 2001; Malfait et al. 2006; Beerends et al. 2013). In the biomedical field, voice and speech quality estimation algorithms were developed to evaluate the severity of dysphonia (abnormality in the perceived quality of voice production) (Awan et al. 2010) and the associated voice quality of pathological speech (Parsa and Jamieson 2001; Ritchings et al. 2002; Gu et al

