Deep Learning the City: Quantifying Urban Perception at a Global Scale



1 Indian Institute of Technology, Delhi, India [email protected]
2 Virginia Tech, Blacksburg, USA [email protected]
3 MIT Media Lab, Cambridge, USA {naik,raskar,hidalgo}@mit.edu

Abstract. Computer vision methods that quantify the perception of the urban environment are increasingly being used to study the relationship between a city’s physical appearance and the behavior and health of its residents. Yet, the throughput of current methods is too limited to quantify the perception of cities across the world. To tackle this challenge, we introduce a new crowdsourced dataset containing 110,988 images from 56 cities, and 1,170,000 pairwise comparisons provided by 81,630 online volunteers along six perceptual attributes: safe, lively, boring, wealthy, depressing, and beautiful. Using this data, we train a Siamese-like convolutional neural architecture, which learns from a joint classification and ranking loss, to predict human judgments of pairwise image comparisons. Our results show that crowdsourcing combined with neural networks can produce urban perception data at the global scale.

Keywords: Perception · Attributes · Street view · Crowdsourcing

1 Introduction

© Springer International Publishing AG 2016. B. Leibe et al. (Eds.): ECCV 2016, Part I, LNCS 9905, pp. 196–212, 2016. DOI: 10.1007/978-3-319-46448-0_12
Electronic supplementary material: The online version of this chapter (doi:10.1007/978-3-319-46448-0_12) contains supplementary material, which is available to authorized users.

“We shape our buildings, and thereafter our buildings shape us.” – Winston Churchill

These famous remarks reflect the widely held belief among policymakers, urban planners and social scientists that the physical appearance of cities, and its perception, impacts the behavior and health of their residents. Based on this idea, major policy initiatives, such as the New York City “Quality of Life Program”, have been launched across the world to improve the appearance of cities. Social scientists have either predicted or found evidence for the impact of the perceived unsafety and disorderliness of cities on criminal behavior [1,2], education [3], health [4], and mobility [5], among others. However, these studies have been limited to a few neighborhoods, or a handful of cities at most, due to a lack of quantified data on the perception of cities. Historically, social scientists have collected this data using field surveys [6]. In the past decade, a new source of data on urban appearance has emerged in the form of “Street View” imagery. Street View has enabled researchers to conduct virtual audits of urban appearance, with the help of trained experts [7,8] or crowdsourcing [9,10]. However, field surveys, virtual audits and crowdsourced studies lack both the resolution and the scale to fully utilize the global corpus of Street View imagery. For instance, New York City alone has roughly one million street blocks, which makes generating an exhaustive city-wide dataset of urban appearance a daunting task. Naturally, generat