Towards robust voice pathology detection


S.I.: ADVANCES IN BIO-INSPIRED INTELLIGENT SYSTEMS

Towards robust voice pathology detection
Investigation of supervised deep learning, gradient boosting, and anomaly detection approaches across four databases

Pavol Harar1 • Zoltan Galaz1 • Jesus B. Alonso-Hernandez2 • Jiri Mekyska1 • Radim Burget1 • Zdenek Smekal1

Received: 10 January 2018 / Accepted: 24 March 2018
© The Natural Computing Applications Forum 2018

Abstract

Automatic, objective, non-invasive detection of pathological voice based on computerized analysis of acoustic signals can play an important role in early diagnosis, progression tracking, and even effective treatment of pathological voices. In search of such a robust voice pathology detection system, we investigated three distinct classifiers within the supervised learning and anomaly detection paradigms. We conducted a set of experiments using a variety of input data such as raw waveforms, spectrograms, mel-frequency cepstral coefficients (MFCC), and conventional acoustic (dysphonic) features (AF). In comparison with previously published works, this article is the first to utilize a combination of four different databases comprising normophonic and pathological recordings of sustained phonation of the vowel /a/, unrestricted to a subset of vocal pathologies. Furthermore, to the best of our knowledge, this article is the first to explore gradient-boosted trees and deep learning for this application. The following best classification performances, measured by F1 score on a dedicated test set, were achieved: XGBoost (0.733) using AF and MFCC, DenseNet (0.621) using MFCC, and Isolation Forest (0.610) using AF. Even though these results are of an exploratory character, the conducted experiments do show the promising potential of gradient boosting and deep learning methods to robustly detect voice pathologies.

Keywords Voice pathology detection • Deep learning • Gradient boosting • Anomaly detection
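For readers unfamiliar with the metric quoted in the abstract, the F1 score is the harmonic mean of precision and recall. The following is only a minimal sketch of how such a score could be computed for a binary normophonic/pathological task. The function name `f1_score`, the label encoding (1 = pathological, 0 = normophonic), the choice of the pathological class as the positive class, and the toy labels are all assumptions made for illustration; they are not taken from the article and do not reproduce its results.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return the F1 score (harmonic mean of precision and recall)
    for the class given by `positive`."""
    pairs = list(zip(y_true, y_pred))
    # True positives: positive class predicted and correct.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: positive class predicted but wrong.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: positive class missed.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No correct positive predictions: F1 is defined here as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels, illustrative only: 1 = pathological, 0 = normophonic.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]
score = f1_score(y_true, y_pred)
print(f"F1 = {score:.3f}")
```

Because F1 ignores true negatives, it can differ noticeably from plain accuracy when the two classes are imbalanced, which is one common reason for preferring it in detection tasks.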

Pavol Harar
[email protected]

1 Brno University of Technology, Technicka 3082/12, 61 600 Brno, Czech Republic

2 Institute for Technological Development and Innovation in Communications (IDeTIC), University of Las Palmas de Gran Canaria, Parque Científico Tecnológico de la ULPGC, Polivalente II, Planta 2, 35017 Las Palmas de Gran Canaria, Spain

1 Introduction

Voice pathology can be caused by the presence of tissue infection, systemic changes, mechanical stress, surface irritation, tissue changes, neurological and muscular changes, and other factors [59]. Vocal pathology affects the mobility, functionality, and shape of the vocal folds, resulting in irregular vibrations and increased acoustic noise. Such a voice sounds strained, harsh, weak, and breathy [27, 58], which significantly contributes to the overall poor voice quality [10, 38]. To date, vocal pathology detection has been approached by subjective and objective evaluations [37]. The first category (subjective evaluation) consists of so-called in-hospital auditory-perceptual and visual examination of t
