Speech Analysis for Mental Health Assessment Using Support Vector Machines

Speech and language dysfunction (SLD) is one of the primary symptoms of mental disorders, such as schizophrenia. Because of the difficulties and subjective nature of SLD assessments, their use in clinical assessment of mental health problems has been limi



Insu Song and Joachim Diederich

I. Song (✉), School of Business and IT, James Cook University, Singapore Campus, Singapore 574421, Singapore
J. Diederich, School of Information Technology and Electrical Engineering, The University of Queensland, Brisbane, QLD 4072, Australia

M. Lech et al. (eds.), Mental Health Informatics, Studies in Computational Intelligence 491, DOI: 10.1007/978-3-642-38550-6_5, © Springer-Verlag Berlin Heidelberg 2014

5.1 Introduction

Common measures of speech and language disorder (SLD) are observer-rated scales, such as TLC and CLANG, as they provide a broad assessment of symptoms of mental health problems, such as schizophrenia [1, 2]. Common SLD rating items include phenomenological assessments (e.g., poverty-of-speech) and linguistic assessments (e.g., excess phonetic-association). These observer-rated scales are subjective in nature and require intensive human effort. In addition, inter-rater reliability is a problem for these scales, as it is time-consuming to establish and assess. Therefore, the use of observer-rated scales in clinical assessment of mental health conditions has been limited. Automated speech and language assessment methods (e.g., [3]) have shown the potential to discriminate accurately between groups.

In this chapter, we evaluate the use of N-gram concept features and Support Vector Machines (SVMs) for predicting the following SLD assessment items automatically from speech samples: the Thought, Language and Communication (TLC) rating items [1] and the Clinical Language Disorder Rating Scale (CLANG) rating items [2]. TLC has 18 items for evaluating speech in schizophrenia, such as Poverty of Speech and Loss of Goal. CLANG has 17 items, such as Excess Phonetic Association and Abnormal Syntax. The sum of the per-item prediction values was then used to predict the underlying mental health condition. This is a two-level hierarchical classifier: at the first level, a set of SVM classifiers predicts specific SLD items (e.g., poverty of speech); at the second level, the first-level results are combined to give the final decision. Importantly, the intermediate results at the first level (the predictions for the individual SLD items) serve as explanations of the final decision.

The TLC and CLANG items are rated on 5- or 4-point Likert scales, but for our purpose the scales were converted to 2-point scales: -1 for no points and +1 for any points in the original ratings. The transcribed speech samples of 46 participants (19 schizophrenia patients and 27 controls) were used to evaluate the N-gram concept features and SVM classifiers in predicting TLC and CLANG ratings. For each item, we applied C-SVC (soft-margin SVM classifiers) for prediction: +1 for any points on an item (i.e., rating > 0) and -1 for no points (i.e., rating = 0). Although our sample size was small, the results suggest the possibility of usin