On machine vision and photographic imagination



ORIGINAL ARTICLE

Daniel Chávez Heras¹ · Tobias Blanke²

Received: 1 May 2020 / Accepted: 14 October 2020
© The Author(s) 2020

Abstract

In this article we introduce the concept of implied optical perspective in deep learning computer vision systems. Taking the BBC’s experimental television programme “Made by Machine: When AI met the Archive” (2018) as a case study, we trace a conceptual and material link between the system used to automatically “watch” the television archive and a specific type of photographic practice. From a computational aesthetics perspective, we show how deep learning machine vision relies on photography, its technical regimes and epistemic advantages, and we propose a novel way to identify the latent camera through which the BBC archive was seen by machine.

Keywords: Computational aesthetics · Philosophy of photography · AI television · Computer vision · Deep learning · Dataset archaeology

* Daniel Chávez Heras [email protected]

1 Department of Digital Humanities, King’s College London, London, UK
2 University of Amsterdam, Amsterdam, The Netherlands

1 Introduction

Is that a person or a reflection? A man or a woman? Is the woman holding a mobile phone, or is it, rather, the statue of an ancient Egyptian king? Is the man wearing a shirt, or is it an elephant? Or a stuffed animal holding a banana…? (Figs. 1, 2, 3). These are some of the mislabellings produced when a small team of technologists and researchers set a computer vision system to “watch” thousands of hours of British television for the project “Made by Machine: When AI met the Archive” (MbM), whose outputs were eventually packaged and broadcast on BBC Four as an experimental “AI TV” programme in 2018.¹

In line with the public purposes of the British broadcaster (BBC 2018), one of the main goals of the programme was to show a wider audience some of the possibilities and limitations of AI, and in particular of the deep learning approaches that underlie many contemporary computer vision systems. From a research perspective, the project was also designed as a prompt to explore just how exactly computers are said to be “seeing”. What type of knowledge is produced by computer vision, and how does it inform the ways we understand and give currency to audio–visual media more generally?

In related work, such questions have generally been approached by focussing on training datasets and how they are assembled, as well as on how the resulting AI systems represent or fail to represent different sectors of society. Exemplary of this approach are the works of Kate Crawford and Adam Harvey:

“Training sets, then, are the foundation on which contemporary machine-learning systems are built. They are central to how AI systems recognize and interpret the world. These datasets shape the epistemic boundaries governing how AI systems operate, and thus are an essential part of understanding socially significant questions about AI.” (Crawford and Paglen 2019)

“A photo is no longer just a photo when it can a