Visualizing image content to explain novel image discovery
Jake H. Lee¹,² · Kiri L. Wagstaff²
Received: 14 August 2019 / Accepted: 18 June 2020
© The Author(s), under exclusive licence to Springer Science+Business Media LLC, part of Springer Nature 2020
Abstract
The initial analysis of any large data set can be divided into two phases: (1) the identification of common trends or patterns and (2) the identification of anomalies or outliers that deviate from those trends. We focus on the goal of detecting observations with novel content, which can alert us to artifacts in the data set or, potentially, the discovery of previously unknown phenomena. To aid in interpreting and diagnosing the novel aspect of these selected observations, we recommend the use of novelty detection methods that generate explanations. In the context of large image data sets, these explanations should highlight what aspect of a given image is new (color, shape, texture, content) in a human-comprehensible form. We propose DEMUD-VIS, the first method for providing visual explanations of novel image content by employing a convolutional neural network (CNN) to extract image features, a method that uses reconstruction error to detect novel content, and an up-convolutional network to convert CNN feature representations back into image space. We demonstrate this approach on diverse images from ImageNet, freshwater streams, and the surface of Mars. Finally, we evaluate the utility of the visual explanations with a user study.

Keywords Novelty detection · Explanations · Image analysis
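To make the idea of reconstruction-error novelty detection concrete, the following is a minimal sketch and not the DEMUD-VIS implementation. It assumes a plain PCA-style linear model fit with NumPy, and it uses synthetic vectors as a hypothetical stand-in for CNN features. The function names `fit_subspace` and `novelty_scores` are illustrative only.

```python
import numpy as np

def fit_subspace(X, k):
    """Fit a rank-k linear model (mean plus top-k principal directions) to the rows of X."""
    mu = X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, Vt[:k]

def novelty_scores(X, mu, U):
    """Return the squared reconstruction error per item and the residual vectors."""
    centered = X - mu
    reconstruction = centered @ U.T @ U   # project onto the learned subspace
    residual = centered - reconstruction  # what the model could not explain
    return np.sum(residual ** 2, axis=1), residual

rng = np.random.default_rng(0)

# Hypothetical stand-in for CNN feature vectors: 200 items near a 2-D subspace of a 10-D space.
basis = rng.normal(size=(2, 10))
known = rng.normal(size=(200, 2)) @ basis + 0.01 * rng.normal(size=(200, 10))
mu, U = fit_subspace(known, k=2)

# One item that follows the known pattern, and a copy with one feature perturbed.
typical = rng.normal(size=(1, 2)) @ basis
novel = typical.copy()
novel[0, 7] += 5.0

scores, residuals = novelty_scores(np.vstack([typical, novel]), mu, U)
print(scores)  # the perturbed item should receive the larger score
```

In this sketch the residual vector plays the role of the explanation: it records which feature dimensions the model failed to reconstruct. The abstract describes converting such feature-space information back into image space with an up-convolutional network, a step this sketch does not attempt.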
Responsible editor: Pierre Baldi
Kiri L. Wagstaff [email protected]
Jake H. Lee [email protected]
1 Columbia University, New York, USA
2 Jet Propulsion Laboratory, California Institute of Technology, 4800 Oak Grove Drive, Pasadena, CA 91109, USA
1 Introduction
Increases in computational power and data analysis capabilities have inspired a corresponding increase in the appetite for data collection. Data sets collected by scientific, industrial, financial, and political efforts continue to grow in scale and complexity. Comprehending the contents of these data sets becomes challenging as the number of items increases to thousands, millions, or more. Data sets that consist of images can be particularly problematic: while humans are very good at image understanding, no human can feasibly scan through and comprehend a collection of millions of images.

Automated machine learning methods can organize and prioritize image collections to make the best use of limited human attention and time. Classification methods can identify members of known classes, enabling humans to quickly zero in on images of interest. Unsupervised methods enable exploration of data sets where the classes may not yet be known. For example, clustering methods can identify common groups or trends within the data set.

Our focus is on a complementary discovery task: highlighting novelties or anomalies within the data set. Methods that identify novel obse