Deep Impression: Audiovisual Deep Residual Networks for Multimodal Apparent Personality Trait Recognition

Abstract. Here, we develop an audiovisual deep residual network for multimodal apparent personality trait recognition. The network is trained end-to-end to predict the Big Five personality traits of people from their videos. That is, the network does not require any feature engineering or visual analysis such as face detection, face landmark alignment or facial expression recognition. Recently, the network won third place in the ChaLearn First Impressions Challenge with a test accuracy of 0.9109.

Keywords: Big five personality traits · Audiovisual · Deep neural network · Deep residual network · Multimodal

1 Introduction

Appearances influence what people think about the personality of other people, even without any interaction with them. These judgments can be made very quickly, as early as 100 ms [35]. Although some studies have shown that people are good at forming accurate first impressions about the personality traits of others after viewing their photographs or videos [4,21], it has also been shown that relying on appearance alone does not always result in correct first impression judgments [22]. Several characteristics of people, ranging from clothing to facial expressions, contribute to first impression judgments about personality [29]. For example, [30] showed that photographs of the same person taken with different facial expressions change the judgments about that person's personality traits, such as trustworthiness and extraversion, as well as other perceived characteristics such as attractiveness and intelligence. Furthermore, after short encounters, people are better at guessing others' personality traits if they find them attractive [18]. The same study also showed that people form more positive first impressions about more attractive people. Studies of personality prediction generally deal with recognizing either the actual personality traits of people, which can be measured by self- or acquaintance-reports, or apparent personality traits, which are the impressions about the personality of an unfamiliar individual [34]. Below we review the recent work in apparent personality prediction.

Y. Güçlütürk et al.
© Springer International Publishing Switzerland 2016. G. Hua and H. Jégou (Eds.): ECCV 2016 Workshops, Part III, LNCS 9915, pp. 349–358, 2016. DOI: 10.1007/978-3-319-49409-8_28

Most of the previous work on apparent personality modeling and prediction has been in the domain of paralanguage, i.e. speech, text, prosody, other vocalizations and fillers [34]. Conversations (both text and audio) [19] and speech clips [20,23] were the most commonly analyzed materials. In this domain, the INTERSPEECH 2012 Speaker Trait Challenge [25] enabled a systematic comparison of computational methods by providing a dataset comprising audio data and extracted features. The competition had three sub-challenges for predicting the Big Five personality traits, likability and pathology of speakers. Recently, prediction of apparent personality traits from soci