Emotional quantification of soundscapes by learning between samples


Stavros Ntalampiras1

Received: 9 March 2020 / Revised: 7 July 2020 / Accepted: 27 July 2020 / © The Author(s) 2020

Abstract

Predicting the emotional responses of humans to soundscapes is a relatively recent field of research with a wide range of promising applications. This work presents the design of two convolutional neural networks, ArNet and ValNet, responsible for quantifying the arousal and the valence evoked by soundscapes, respectively. We build on the knowledge acquired from applying traditional machine learning techniques to this domain and design a suitable deep learning framework. Moreover, we propose the use of artificially created mixed soundscapes, whose distributions lie between those of the available samples, a process that increases the variance of the dataset and leads to significantly better performance. The reported results outperform the state of the art on a soundscape dataset that follows Schafer's standardized categorization, which considers both a sound's identity and the respective listening context.

Keywords Acoustic ecology · Audio signal processing · Affective computing
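The idea of creating mixed soundscapes that lie between the available samples can be illustrated with a minimal sketch. It assumes a simple linear mix of two equal-length waveforms and a matching linear interpolation of their (arousal, valence) annotations; the function name `mix_soundscapes`, the mixing ratio `r`, and the label interpolation are illustrative assumptions and are not taken from the paper, which may mix signals and labels differently.

```python
import random


def mix_soundscapes(x1, x2, y1, y2, r):
    """Linearly mix two waveforms and their (arousal, valence) labels.

    Assumption (not from the paper): r in [0, 1] weights the first
    sample, and labels are interpolated with the same weight.
    """
    if len(x1) != len(x2):
        raise ValueError("waveforms must have the same length")
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must lie in [0, 1]")
    # Sample-wise weighted sum of the two signals.
    mixed_wave = [r * a + (1.0 - r) * b for a, b in zip(x1, x2)]
    # Same weighting applied to each label dimension.
    mixed_label = tuple(r * p + (1.0 - r) * q for p, q in zip(y1, y2))
    return mixed_wave, mixed_label


# Toy data: two short random "waveforms" with made-up annotations.
rng = random.Random(0)
clip_a = [rng.uniform(-1.0, 1.0) for _ in range(8)]
clip_b = [rng.uniform(-1.0, 1.0) for _ in range(8)]
wave, label = mix_soundscapes(clip_a, clip_b, (0.2, 0.8), (0.7, 0.3), r=0.5)
```

Drawing `r` at random for each pair would yield many synthetic samples located between the original ones, which is one plausible way to increase the variance of a small dataset.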

Stavros Ntalampiras
[email protected]
1 University of Milan, via Celoria 18, Milan, Italy

Multimedia Tools and Applications

1 Introduction

The field that aims to assess the emotional content of generalized sounds, including speech, music and sound events, is attracting the interest of an ever increasing number of researchers [12, 15–17, 21, 25]. However, there is still a gap regarding works addressing the specific case of soundscapes, i.e. the combination of sounds forming an immersive environment [20]. Soundscape emotion prediction (SEP) focuses on understanding the emotions perceived by a listener of a given soundscape. Soundscapes may comprise the necessary stimuli for a receiver to manifest different emotional states and/or actions; for example, one may feel joyful in a natural environment. Such contexts demonstrate the close relationship between soundscapes and the emotions they evoke on the listener side. That said, SEP can have a significant impact in a series of application domains, such as sound design [18, 22], urban planning [3, 24], and acoustic ecology [4, 11], to name but a few.

Affective computing has received a lot of attention [9] in the last decades, with a special focus on the analysis of emotional speech, where a great gamut of generative and discriminative classifiers has been employed [21, 28], and of music [7, 26], where most of the research concentrates on regression methods. The literature analyzing the emotional responses to soundscape stimuli consists mainly of surveys requesting listeners to characterize them. The work described in [1] details such a survey, which aims to analyze soundscapes categorized as technological, natural or human. Davies et al. [3] provide a survey specifically designed to assess various emotion
